stepva / chatalysis
Analyse and visualise your chats - currently supporting Facebook Messenger and Instagram
License: MIT License
Instead of taking it from dir.
Resolve when we have a domain for hosting stuff
The standard distribution (Python files run from the command line) allows adding profile pictures to the output if the user adds the pictures to a certain folder. We need a way to allow for this in the bundled app - probably by having the user select a folder with the profile pictures, just like they select the folder with the messages. I would probably like to combine it with #88 and include this in the settings menu.
OS-specific issue: / vs \ as the path separator, mainly in the Jinja template for chart.js and in other paths.
Idea - add word cloud charts to chatalysis
When you press the same button twice, it opens a second window with the same content which cannot be closed (trying to close the window causes an error). Fix this, so that only one instance of the same type of window can be open at the same time.
Currently, the emoji shown in the output stats are pure Unicode emoji (the ones sent via the emoji selection). Text emoji such as :D or :( which are shown as emoji in Messenger, stay in their original string form in the input JSONs, and are therefore not counted.
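One way text emoji could be counted is sketched below. The `TEXT_EMOJI` mapping and the function name `count_text_emoji` are hypothetical; the actual set of strings Messenger renders as emoji is not given here, so the table would need to be filled in from real data. Longer patterns are tried first so that ">:(" is not counted as ":(".

```python
import re
from collections import Counter

# Hypothetical mapping for illustration only; the real list of text emoji
# that Messenger converts is an assumption and would need to be verified.
TEXT_EMOJI = {
    ":D": "😃",
    ":)": "🙂",
    ":(": "🙁",
    ">:(": "😠",
}

# Sort alternatives longest-first so ">:(" wins over the shorter ":(".
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(TEXT_EMOJI, key=len, reverse=True))
)


def count_text_emoji(messages):
    """Count text emoji in an iterable of message strings."""
    counts = Counter()
    for text in messages:
        for match in _PATTERN.findall(text):
            counts[TEXT_EMOJI[match]] += 1
    return counts
```

This naive matching can produce false positives (for example inside URLs or code snippets), so the counts might need a stricter pattern in practice.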
We could add some basic stats about message contents such as average message length and length of the longest message.
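A minimal sketch of these two stats, assuming the message texts are already available as a list of strings (the function name `message_length_stats` is hypothetical, and empty messages are skipped by assumption):

```python
def message_length_stats(messages):
    """Return the average and maximum length of non-empty messages."""
    lengths = [len(text) for text in messages if text]
    if not lengths:
        # No text messages at all, e.g. a chat containing only photos.
        return {"average": 0.0, "longest": 0}
    return {
        "average": sum(lengths) / len(lengths),
        "longest": max(lengths),
    }
```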
It would be nice if the buttons in the source selection menu had logos instead of text.
A simple settings section in the GUI with options like wordclouds (when I finish them :D), nickname plots, emojis etc. that a user could turn on and off (saved in a config?). The final output HTML would then be generated based on those options, with some sections excluded if the user doesn't want them.
Reason: the output might get too long, and the wordclouds, for example, might be useless for some chats (those could be False by default). This way we could also easily include plenty of other stats (like the message lengths, which would be optional as well).
Look into adding support for WhatsApp. https://github.com/orkestral/venom might be useful.
Change the logic: don't look for folders starting with "messages"; instead, maybe try to identify all inbox folders straight away, independently of the parent folder structure. Sometimes Facebook downloads it weirdly: my friend had to download 5 folders, 3 named "messages" and 2 named "facebook-Something", inside which was another messages folder - and we needed all 5 of those.
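The idea of locating inbox folders regardless of their parents could look roughly like this. It is only a sketch: the function name `find_inbox_folders` is hypothetical, and it assumes that the conversation folders are always directories literally named "inbox".

```python
from pathlib import Path


def find_inbox_folders(root):
    """Return every directory named 'inbox' anywhere below root."""
    # rglob searches recursively, so the names of the parent folders
    # ("messages", "facebook-Something", ...) do not matter.
    return sorted(p for p in Path(root).rglob("inbox") if p.is_dir())
```

The user would then only need to point at one folder containing all the downloads, and every inbox found below it could be merged.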
Something very basic, just to test it, will probably be better than command line interface
In chatalysis.py, htmllyse() on lines 29-31:
    if os.path.exists(file):
        wb.open(file)
        return
It would be good to somehow check the timestamp of the file's creation; otherwise, when the analysis is repeated after a while with new data, it is not run fresh.
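One possible check is sketched below. It compares modification times rather than creation times (a swapped-in simplification, since creation time is not reported consistently across platforms), and both the function name `output_is_fresh` and the `data_dir` parameter are hypothetical.

```python
import os


def output_is_fresh(output_file, data_dir):
    """True if output_file exists and is newer than every file in data_dir."""
    if not os.path.exists(output_file):
        return False
    output_mtime = os.path.getmtime(output_file)
    for dirpath, _dirnames, filenames in os.walk(data_dir):
        for name in filenames:
            # Any data file modified after the output means the cached
            # output is stale and the analysis should run again.
            if os.path.getmtime(os.path.join(dirpath, name)) > output_mtime:
                return False
    return True
```

The existing `os.path.exists(file)` condition could then be replaced by a call such as `output_is_fresh(file, data_dir)`, so the cached HTML is only opened when no newer data exists.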
I'm not a big fan of the formatting changes, especially the if statements (for example in the method create()). Ideally, keep the formatting as it was and then agree on some code style and formatting standard - this applies just to the changes made in GUI.py.
Originally posted by @miskfi in #11 (review)
Recently (at least in my desktop Messenger app), when one starts a message with ">", the rest of the message is read as a quote. What does this look like in the data?
This is especially concerning for the ">:(" emoji.
Look into our chatalysis chat
We have some "draft logo" but a proper logo / icon would be awesome.
The personal stats analysis can take pretty long (tens of seconds) and during that time the program just appears frozen. Add a progress bar to indicate that something's really happening.
See ttk.Progressbar & tqdm.tk
Might be easier - don't need npm, Plotly charts are interactive...
Probably an issue with Safari?
0:97: execution error: File some object wasn't found. (-43)
69:77: execution error: Can't get application "chrome". (-1728)
The (Facebook Messenger) analysis is currently pretty slow, we can try some ways to speed it up (multithreading, optimizing the analysis algorithm, using Cython).
Running Chatalysis currently requires the use of the terminal, which isn't something that many people know how to use. Instead, we could distribute bundled executables from Pyinstaller that will run Chatalysis simply by clicking on them without the need to use the terminal or install anything.
We could add another output option - personal global stats. It would show the stats (number of messages, gifs etc., histograms, emoji stats) for your messages from all available conversations. The downside is that it would probably be quite slow.
Chatalysis uses version 0.6.0 of the emoji library, which needs to be upgraded to a newer version.
https://pypi.org/project/emoji/1.2.0/#description
Works fine when run from source, without being built.
Change the input format for individual conversations from the "condensed" one used in the folder structure (i.e. "johnsmith") to a normal one (i.e. "John Smith"). A drop-down list of all available conversations would be a nice addition as well.
@stepva would it be possible to put the logo of the message source somewhere in the template (top right/left corner for example)?
The GUI looks quite ugly and dated on Linux (and Windows to some degree) - we could experiment with Styles or ttk Themes to make it nicer (see this). However, on macOS the GUI looks pretty nice, so I would like to keep it intact if possible.
Add top N most popular messages (messages with most reactions) in a group chat to the output stats
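A rough sketch of how the top N messages could be selected. It assumes each message is a dict whose optional "reactions" key holds a list with one entry per reaction; that structure and the function name `top_reacted_messages` are assumptions that should be checked against the real export.

```python
import heapq


def top_reacted_messages(messages, n=5):
    """Return the n messages with the most reactions, most popular first."""
    # Assumed structure: {"content": ..., "reactions": [...]}.
    # Messages without any reactions are skipped entirely.
    reacted = (m for m in messages if m.get("reactions"))
    return heapq.nlargest(n, reacted, key=lambda m: len(m["reactions"]))
```

`heapq.nlargest` avoids sorting the whole conversation, which may matter for large group chats.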
The messages pie chart already shows only the top 10 + others. Make the individual stats behave similarly instead of showing everyone, which makes the file way too big and not usable at all.
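The "top 10 + others" grouping could be shared by all stats through a small helper like the one below. This is only a sketch; the function name `top_n_with_others` and the "Others" label are hypothetical, and the input is assumed to be a mapping from participant name to a count.

```python
from collections import Counter


def top_n_with_others(counts, n=10, others_label="Others"):
    """Keep the n largest entries and sum the remainder into one bucket."""
    ranked = Counter(counts).most_common()
    top = dict(ranked[:n])
    remainder = sum(value for _, value in ranked[n:])
    if remainder:
        top[others_label] = remainder
    return top
```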
test case for myself - bisar group chat
Add support for analyzing Telegram messages (which can be exported into JSON). It would be nice to convert the code into OOP for a more modular approach along with this.
So far, we've been testing everything manually and it's becoming very tedious. Set up some automatic testing - either some unit tests that will run locally with our personal data or CI testing using some dummy data.
Sometimes the output from tabulate isn't aligned and I can't seem to figure out why. It'll be easier to just write my own function for creating text "tables".
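A custom text-table function does not need much code. The sketch below (the function name `format_table` is hypothetical) pads every column to its widest cell. Note that it measures width with `len`, so wide characters such as emoji could still break the alignment; whether that is the cause of the tabulate problem is not established here.

```python
def format_table(rows, headers):
    """Format rows as a plain-text table with left-aligned columns."""
    table = [[str(h) for h in headers]] + [[str(c) for c in row] for row in rows]
    # Width of each column is the length of its longest cell.
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]
    # Separator line between the header and the data rows.
    lines.insert(1, "  ".join("-" * width for width in widths))
    return "\n".join(lines)
```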
Once you select a source, you can't go back and change it. Add "go back" button that leads to the source selection menu.
Only with latest data, these weird colored squares are showing in emoji stats, probably some special emojis (or something else) being parsed incorrectly. Not related to any recent changes in code.
Could be worked on together with #31; it looks like the analysis of emojis needs a bigger rework :(
rip instalysis 2021-2021
Merge the HTML templates together so that we don't have 20 different templates. Will make editing templates much easier.
Chatalysis supports downloading additional messages for a new time interval when user already has downloaded some messages previously. However, this does not work correctly: