favorable-candidate-prediction's Introduction

Election

This is the working directory for our project.

Explanation of all files:

Images: output of result_visualization.py. Contains all images produced by the visualizations run on the labelled files.
bingnews.py: script to scrape news articles. (Comments still need to be added.)
BJP_Labelled.csv and INC_Labelled.csv: output of vaderSentiment.py. Contain the BJP and INC datasets labelled POS, NEG, or NEU by the program.
Logistic Regression, SVM, RandomForest, XGBoost, and Naive Bayes are the programs for the respective ML models. Comments have been added to the Logistic Regression file to aid understanding; the other files contain very similar code.
preprocess.py: preprocessing and cleaning of data for the ML models. Comments added.
procBJPtweets.csv and procINCtweets.csv are the datasets of BJP and INC tweets. (PS: more INC tweets are needed; the current set is not enough.)
result_visualization.py: visualizations of BJP_Labelled.csv and INC_Labelled.csv. Preprocessing included. Comments added.
train.csv: 60k tweets for training all ML models. A small part of the 16 million tweets dataset, which is uploaded to Google Drive in the TRAINING DATASET folder.
tweets.py: script to fetch Twitter data.
vaderSentiment.py: classifies tweets in the dataset as POS, NEG, or NEU using the VADER lexicon-based approach.
visual.py: contains visualizations of train.csv. (Test visualizations.)
data.csv: dataset required for naive_bayes.py. Contains 40.5k tweets (equal ratio of pos and neg).
pred.py: consolidated code with various graphical representations. Uses LS_2.0.csv as its dataset.
LS_2.0.csv: dataset for kaggle.py.
accuracy.py: compares the predicted winners with the actual winners to measure accuracy.
Winners.csv: consolidated dataset of all winners of the Karnataka 2019 elections.
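
The comparison that accuracy.py is described as doing can be illustrated with a minimal sketch. This is not the code from accuracy.py: the column names `constituency` and `winner`, the helper names `load_winners` and `winner_accuracy`, and the toy rows are all assumptions made for illustration, since the layout of Winners.csv is not documented here.

```python
import csv
import io


def load_winners(csv_text, key_col="constituency", value_col="winner"):
    """Parse CSV text into a {constituency: winner} mapping.

    The column names are hypothetical; adjust them to the real file layout.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row[key_col].strip(): row[value_col].strip() for row in reader}


def winner_accuracy(predicted, actual):
    """Fraction of shared constituencies where the predicted winner is correct."""
    shared = predicted.keys() & actual.keys()
    if not shared:
        raise ValueError("no constituencies in common")
    correct = sum(1 for seat in shared if predicted[seat] == actual[seat])
    return correct / len(shared)


# Toy placeholder data, not taken from Winners.csv.
actual_csv = (
    "constituency,winner\n"
    "Seat A,Candidate X\n"
    "Seat B,Candidate Y\n"
    "Seat C,Candidate Z\n"
)
predicted = {
    "Seat A": "Candidate X",
    "Seat B": "Candidate Q",
    "Seat C": "Candidate Z",
}

actual = load_winners(actual_csv)
score = winner_accuracy(predicted, actual)
print(f"Accuracy: {score:.2%}")
```

Only constituencies present in both mappings are scored, so a missing prediction is skipped rather than counted as wrong; whether accuracy.py treats missing rows the same way is not stated in this README.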
