chatalysis's People

Contributors

jdjfisher, miskfi, stepva

chatalysis's Issues

Add support for profile pics to the bundled app

The standard distribution (Python files run from the command line) allows adding profile pictures to the output if the user puts the pictures in a certain folder. We need a way to allow for this in the bundled app, probably by having the user select a folder with the profile pictures, just like they select the folder with the messages. I would probably like to combine it with #88 and include this in the settings menu.

Word clouds

Idea: add word cloud charts to chatalysis.

Analyze "text emojis"

Currently, the emoji shown in the output stats are pure Unicode emoji (the ones sent via the emoji selection). Text emoji such as :D or :(, which are shown as emoji in Messenger, stay in their original string form in the input JSONs and are therefore not counted.
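A minimal sketch of how text emoji could be counted from the raw message strings. The emoticon list and the function name `count_text_emoji` are hypothetical and not part of the existing chatalysis code; the real set of emoticons that Messenger converts would need to be checked.

```python
import re
from collections import Counter

# Hypothetical, deliberately small set of text emoticons; extend as needed.
TEXT_EMOJI = (":D", ":(", ":)", ";)", ":P", "<3")

# Try longer emoticons first so a longer one wins over any shorter prefix.
_PATTERN = re.compile(
    "|".join(re.escape(e) for e in sorted(TEXT_EMOJI, key=len, reverse=True))
)


def count_text_emoji(messages):
    """Count text emoticons across an iterable of message strings."""
    counts = Counter()
    for text in messages:
        counts.update(_PATTERN.findall(text))
    return counts


counts = count_text_emoji(["haha :D :D", "oh no :(", "plain message"])
```

Plain substring matching like this can produce false positives (for example inside URLs or code snippets), and it would count ">:(" as ":(", so some boundary rules would probably be needed before merging these counts with the Unicode emoji stats.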

Logo buttons

It would be nice if the buttons in the source selection menu had logos instead of text.

Settings to choose what to include in output

A simple settings section in the GUI with options like word clouds (when I finish them :D), nickname plots, emojis, etc. The user could turn them on and off (saved in a config?), and the final output HTML would then be generated based on those options, with some sections excluded if the user doesn't want them.
Reason: the output might get too long, and the word clouds, for example, might be useless for some chats (those could be False by default). This way we could also easily include plenty of other stats (like the message lengths, which would be optional as well).

Improve files importing

Change the logic: don't look for folders starting with "messages"; maybe try to identify all inbox folders straight away, independently of the parent folder structure. Sometimes Facebook splits the download weirdly: my friend had to download 5 folders, 3 named "messages" and 2 named "facebook-Something", inside which was another messages folder, and we needed all 5 of those.
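One way to do this could look like the sketch below, which walks the whole selected directory and collects every folder named "inbox", regardless of what its parents are called. The function name `find_inbox_folders` is hypothetical, and the sketch assumes that conversations always sit inside a directory literally named "inbox".

```python
import os


def find_inbox_folders(root):
    """Return sorted paths of every directory named 'inbox' under root."""
    found = []
    for dirpath, dirnames, _filenames in os.walk(root):
        if "inbox" in dirnames:
            found.append(os.path.join(dirpath, "inbox"))
            # Skip descending into the inbox itself; only conversations live there.
            dirnames.remove("inbox")
    return sorted(found)
```

Calling `find_inbox_folders` on the folder that contains all the downloaded parts would then return one path per part, and the conversations from each could be merged afterwards.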

Basic GUI

Something very basic, just to test it, will probably be better than command line interface

Decide on code formatting

I'm not a big fan of the formatting changes, especially the if statements (for example in the method create()). Ideally, keep the formatting as it was and then agree on some code style and formatting standard. This applies just to the changes made in GUI.py.

Originally posted by @miskfi in #11 (review)

Investigate citing/quoting in messenger

Recently (at least in my desktop Messenger app), when one starts a message with ">", the rest of the message is read as a quote. What does this look like in the data?
This is especially concerning for the ">:(" emoji.
Look into our chatalysis chat.

Even more stats ideas

  • nicks and groupchat names
  • length of voice messages
  • tags (@Štěpán Vácha etc.) - could be tricky because one can tag @"nickname" or @FirstNameOnly

Auto complete box - improve usability

  • don't focus on whole window when arrow down, go straight to first name
  • if only one name left, enter from the name box will by default select that name instead of trying to go further with the unfinished name in the name box
  • scroll bar

Analysis speedup

The (Facebook Messenger) analysis is currently pretty slow. We can try some ways to speed it up (multithreading, optimizing the analysis algorithm, using Cython).

Distribute executables via Pyinstaller

Running Chatalysis currently requires the use of the terminal, which isn't something that many people know how to use. Instead, we could distribute bundled executables from Pyinstaller that will run Chatalysis simply by clicking on them without the need to use the terminal or install anything.

Global stats

We could add another output option: personal global stats. It would show the stats (number of messages, GIFs, etc., histograms, emoji stats) for your messages from all available conversations. The downside is that it would probably be quite slow.

Improve input format for individual conversations

Change the input format for individual conversations from the "condensed" one used in the folder structure (e.g. "johnsmith") to a normal one (e.g. "John Smith"). A drop-down list of all available conversations would be a nice addition as well.

Nicer GUI (especially on Linux)

The GUI looks quite ugly and dated on Linux (and Windows to some degree) - we could experiment with Styles or ttk Themes to make it nicer (see this). However, on macOS the GUI looks pretty nice, so I would like to keep it intact if possible.

Analyze Telegram messages

Add support for analyzing Telegram messages (which can be exported into JSON). It would be nice to convert the code into OOP for a more modular approach along with this.

Set up automatic testing

So far, we've been testing everything manually and it's becoming very tedious. Set up some automatic testing - either some unit tests that will run locally with our personal data or CI testing using some dummy data.

Fix alignment in Top conversations

Sometimes the output from tabulate isn't aligned and I can't seem to figure out why. It'll be easier to just write my own function for creating text "tables".
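A custom replacement could be as small as the sketch below. The function name `format_table` and the sample data are hypothetical; it simply pads every column to the width of its longest cell.

```python
def format_table(rows, headers):
    """Render rows as a left-aligned plain-text table with padded columns."""
    # Convert everything to strings so widths can be measured uniformly.
    table = [[str(h) for h in headers]] + [[str(c) for c in row] for row in rows]
    # Each column is as wide as its longest cell (header included).
    widths = [max(len(row[i]) for row in table) for i in range(len(headers))]
    lines = [
        "  ".join(cell.ljust(width) for cell, width in zip(row, widths)).rstrip()
        for row in table
    ]
    return "\n".join(lines)


print(format_table([("Alice", 10), ("Bob", 2000)], ("Name", "Messages")))
```

Note that `len()` counts code points rather than display columns, so names containing emoji or other wide characters could still end up misaligned in a terminal with this approach.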

Incorrect (weird) emojis in group chat after latest data download

Only with the latest data, weird colored squares show up in the emoji stats. They are probably some special emojis (or something else) being parsed incorrectly. This is not related to any recent changes in the code.

This could be worked on together with #31; it looks like the analysis of emojis needs a bigger rework :(

Template refactor

Merge the HTML templates together so that we don't have 20 different templates. Will make editing templates much easier.

Rework the logic of getting jsons, messages and other info to correctly support downloading additional data

Chatalysis supports downloading additional messages for a new time interval when the user has already downloaded some messages previously. However, this does not work correctly:

  • If a group chat changes its name, the JSONs are not combined. This can be fixed, because the latter part of chat_id stays the same (e.g. 8boduxf_zb2jbkk66a and 9boduxf_zb2jbkk66a).
  • Getting the participants, title, etc. also depends on which file they are taken from. The code currently assumes they are the same in all JSONs, which is correct when a user downloads everything at once but fails when more data is added later. This can be fixed by taking the data from the most recently modified file (os.path.getmtime(x)).
  • Unfortunately, this also means reworking the GUI box for selecting conversations (for example, to only include the latest conversation names) and other parts of the process.
  • Once that is done correctly, we can rework message processing. Ideally, "participants" would be the current participants, but for total stats we also need stats from the other names, without all the extra if statements in the for loop. That will be easy once the points above are handled.
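The first two points could be sketched roughly as follows. Both function names are hypothetical, and the sketch assumes that the stable part of chat_id is everything after the last underscore, which matches the example above but has not been verified against real exports.

```python
import os
from collections import defaultdict


def group_by_chat_suffix(folder_names):
    """Group conversation folder names by the part after the last underscore.

    Assumption: that suffix stays the same when a group chat is renamed.
    """
    groups = defaultdict(list)
    for name in folder_names:
        suffix = name.rsplit("_", 1)[-1]
        groups[suffix].append(name)
    return dict(groups)


def newest_file(paths):
    """Return the most recently modified file from a non-empty list of paths."""
    return max(paths, key=os.path.getmtime)


groups = group_by_chat_suffix(["8boduxf_zb2jbkk66a", "9boduxf_zb2jbkk66a"])
```

With the JSONs of one group collected this way, the participants and title could be read from `newest_file(...)`, while the messages from all files in the group are combined for the stats.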

General ideas list

  • length of voice messages
  • tags (@Štěpán Vácha etc.) - could be tricky because one can tag @"nickname" or @FirstNameOnly
  • groupchat photos? (see #53)
  • average message length and length of the longest message (code ready, see #45)
