Coder Social home page Coder Social logo

shree-shubham / elections Goto Github PK

View Code? Open in Web Editor NEW
1.0 2.0 0.0 102.44 MB

Using twitter data to show opinions of presidential candidates

Home Page: https://elections.cubigdata.com/

Python 40.80% Clojure 3.80% Go 15.26% HTML 7.27% JavaScript 11.54% CSS 21.32%

elections's Introduction

Elections

Using twitter data to predict primary election results.

Gathering Data

Setting up MongoDB ( AWS - Ubuntu)

Follow the steps to setup MongoDB on an Ubuntu machine.

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 7F0CEB10
echo "deb http://repo.mongodb.org/apt/ubuntu trusty/mongodb-org/3.0 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.0.list
sudo apt-get update
sudo apt-get install -y mongodb-org

On a Mac:

brew update
brew install mongodb

Starting MongoDB:

sudo service mongod start

Stopping MongoDB:

sudo service mongod stop

Installing tweepy and pymongo

pip install tweepy
pip install pymongo

Setting the environment variables

In the ~/.bashrc file (ubuntu) or ~/.bash_profile file (mac), enter the following:

export TWITTER_CONSUMER_KEY='your key'
export TWITTER_CONSUMER_KEY_SECRET_0='your key'
export TWITTER_ACCESS_TOKEN_KEY='your key'
export TWITTER_ACCESS_TOKEN_KEY_SECRET_0='your key'
export PYTHONPATH="Location of your folder (in this case the location of the folder 'Elections' ":$PYTHONPATH

then either load a new prompt or run

source ~/.bashrc or source ~/.bash_profile

Tmux

A tmux session named collect_twitter_data has been created. It will be running in the background forever until terminated explicitly. Currently extract-tweets.py is running on the AWS instance and it is collecting data. To view what's happening you should 'attach' to that session. Basically, ssh to our instance from the terminal and run the command:

tmux attach -t collect_twitter_data

Then you can use it as an usual ubuntu terminal.

For more info on tmux - https://robots.thoughtbot.com/a-tmux-crash-course

Real-time analytics using streamparse

Streamparse lets you run Python code against real-time streams of data via Apache Storm. With streamparse you can create Storm bolts and spouts in Python without having to write a single line of Java. It also provides handy CLI utilities for managing Storm clusters and projects.

Quickstart with streamparse

With no need for any coding in Java, streamparse lets you get started right away with a basic wordcount topology.

Important: The version of lein installed should be 2.x. If not, upgrade it by following this.

Sentiment Analysis

For sentiment analysis we are using a Naive Bayes classifier from ntlk. The training data currently used is a set of 800,000 tagged tweets.

All examples are from within the sentiment directory.

Training a classifier

Trim and keep all 800,000 tweets

$ python3 trim.py

Trim to 10,000 tweets (faster for testing, use any number up to 800,000)

$ python3 trim.py training_set.csv 10000

Train the classifier (need the trimmed.csv file from above)

$ python3 trainer.py

Train the classifier, with 10 iterations over each set (no accuracy improvement, but gives you a better basis for comparison if there are more trials)

$ python3 trainer.py 10

Using a classifier

The most accurate classifier from training will be saved at the end and that classifier is loaded here. Edit your config.json to use a different classifier.

To test using the interactive cli

$ python3 cli.py
? this is awesome!
pos
? this sucks
neg

You can also use the Analyzer class provided for use within another program. See the cli.py file for an example.

Running a python cron job

Run crontab -e and insert the following snippet inside the file.

*/2 * * * * /absolute_file_path/generate_wordcloud.py

This runs every 2 minutes. PS: Change the absolute paths inside generate_wordcloud.py

A useful link: http://www.thesitewizard.com/general/set-cron-job.shtml

elections's People

Contributors

biw avatar mguida22 avatar narain280493 avatar rddcodes avatar

Stargazers

 avatar

Watchers

 avatar  avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.