PIC16A-Twitter-API
Made by Melina Diaz, Tracy Charles, Michelle Pang
Note: We provided a uclatweets.csv
in case someone wanted to use functions from analysis.py
but didn't have a bearer_token in order to run make_call(...)
from making_dataframe
which makes a CSV
- Project name: UCLA's Twitter API Project
- Names of group members (if you don’t want to for privacy, add usernames): Melina Diaz, Tracy Charles, Michelle Pang
- Short description of the project: Using the Twitter API, we made a dataframe of tweets about UCLA and made inferences about the tweets using clusters and sentiment analysis to determine what is currently trending around UCLA and if the general consensus are positive or negative. Additionally, the Twitter API allows us to extract more details about the original tweet such as the number of retweets, followers and likes which hints to the credibility and influence of the tweet.
- Instructions on how to install the package requirements. Firstly, clone the repository and use
conda create --name NEWENV --file requirements.txt
to download packages used in 'making_dataframe.py' and 'analysis.py' files. Then, open the 'demo.ipynb' file and run all codes to generate the results as shown in the next point. - Detailed description of the demo file (how to run it, what output one should expect to see, and explanations of result):
demo.ipynb
is the end-to-end demo.
- The first cell imports
making_dataframe.py
andanalysis.py
- Second cell creates a dataframe by calling the API with your bearer token and a keyword. It creates a JSON and CSV file named
filename
that has max_results number of tweets (or less if not enough are available). - Third cell will create an obj of our custom class. We created 3 functions which will output recent tweets with the highest popularity scores, topic clusters, and a bar graph and word clouds from the sentiment and words used in tweets. This can be seen in cells 5, 6, 7. Note that the outputs will depend on the dataset, and the dataset changes based on the most recent tweets. The word cloud function also generates different word clouds each run, so it is expected for every user to get different outputs. Here are the examples of the four visualizations.
- Cells 8, 9, 10 are showing the use of exceptions. The exceptions allow the code to run properly without being halted by errors.
- Scope and limitations, including ethical implications, accessibility concerns, and ideas for potential extensions: Although this package is publicly available, users who wish to access the tweets will require a Bearer Token that can be created through the Twitter API developer portal. Some possible extensions include any topic that preferably has a lot of public consensus such as public figures, politics, weather conditions etc.
- License and terms of use (probably MIT license). We obtained the dataset using the Twitter API. The Twitter Development Agreement and Policy can be found here: https://developer.twitter.com/en/developer-terms/agreement-and-policy. Our code is reproducible and open to the general public.
- References and acknowledgement: Twitter API developer platform
- Background and source of the dataset: We generated this dataset, which consists of the most recent tweets about UCLA
- Links to any tutorials you used, and at least 3 specific things you implemented that differentiates your project from what’s already in the tutorial:
- https://towardsdatascience.com/an-extensive-guide-to-collecting-tweets-from-twitter-api-v2-for-academic-research-using-python-3-518fcb71df2a This tutorial shoed how to collect tweets with Twitter's API and form a dataframe, but we didn't like the way the user had to run the functions individually, so we consolidated the function calls. We also didn't like the columns included, so we rewrote how the JSON is turned into a cleaned CSV. We also had to change the keywords and parameters to make the dataframe UCLA-specific, which we learned from https://developer.twitter.com/en/docs/twitter-api/tweets/search/integrate/build-a-query