jenskrumsieck / bundestag-project

I recently took a machine learning course in Python and wanted to practice what I had learned. This repository contains the progress I made in two days of scraping polls from the website of the German Bundestag.

Jupyter Notebook 44.52% Python 55.48%

bundestag-project's Introduction

Bundestag Polls

The poll data was scraped using the code in the scraper subfolder and stored as CSV files in the data subfolder. Each file is named [Voting Period]_data.csv and contains one column per poll, named according to the scheme [Period]-[Session]-[Poll].
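The column naming scheme can be illustrated with a small helper. This is a minimal sketch, assuming the three parts are separated by single hyphens and that no other column has exactly that shape; PollId, parse_poll_column, and the example header names are hypothetical and not taken from the repository.

```python
from typing import NamedTuple, Optional


class PollId(NamedTuple):
    """The three parts of a '[Period]-[Session]-[Poll]' column name."""
    period: str
    session: str
    poll: str


def parse_poll_column(name: str) -> Optional[PollId]:
    """Split a poll column name into its parts; return None for other columns."""
    parts = name.split("-")
    # A poll column must have exactly three non-empty parts.
    if len(parts) != 3 or not all(parts):
        return None
    return PollId(*parts)


# Hypothetical header row: the non-poll column names are made up for illustration.
columns = ["name", "party", "19-1-1", "19-2-1"]
poll_columns = [c for c in columns if parse_poll_column(c) is not None]
print(poll_columns)
```

The parts are kept as strings on purpose, since the source does not say whether they are always numeric.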

The Jupyter notebook index.ipynb contains an attempt at classification with supervised machine learning using a Pipeline and GridSearchCV. Best parameters for the 19th Bundestag (score: 76.78%):

{'classifier__knn__n_neighbors': 3, 'classifier__pca__n_components': 4}

Pipeline:

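# Imports needed to build the pipeline shown below
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Poll columns follow the [Period]-[Session]-[Poll] scheme, so they contain a hyphen
vote_cols = [c for c in df.columns if "-" in c]
# Missing votes are filled with 'Abwesend' (absent) before one-hot encoding
Pipeline(steps=[('preprocess',
                 ColumnTransformer(sparse_threshold=0,
                                   transformers=[('preprocess_vote',
                                                  Pipeline(steps=[('imputer',
                                                                   SimpleImputer(fill_value='Abwesend',
                                                                                 strategy='constant')),
                                                                  ('onehot',
                                                                   OneHotEncoder(handle_unknown='ignore'))]),
                                                  vote_cols)])),
                ('classifier',
                 Pipeline(steps=[('pca', PCA()),
                                 ('knn', KNeighborsClassifier())]))])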
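To show how a grid search over this pipeline might be wired up, here is a rough sketch on synthetic data. The parameter names are the ones reported above; everything else is an assumption made for illustration: the random data, the "party" target column, the grid values, and cv=3 are not taken from the notebook.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.decomposition import PCA
from sklearn.impute import SimpleImputer
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Synthetic stand-in for one [Voting Period]_data.csv file (assumption, not real data).
rng = np.random.default_rng(0)
n_members, n_polls = 60, 8
party = rng.choice(["A", "B"], size=n_members)
votes = {}
for i in range(n_polls):
    # Party A mostly votes "Ja", party B mostly "Nein", with a few dissenters.
    base = np.where(party == "A", "Ja", "Nein")
    flip = rng.random(n_members) < 0.1
    col = np.where(flip, np.where(base == "Ja", "Nein", "Ja"), base).astype(object)
    # A few missing votes, which the imputer turns into 'Abwesend'.
    col[rng.random(n_members) < 0.05] = np.nan
    votes[f"19-{i + 1}-1"] = col
df = pd.DataFrame(votes)
df["party"] = party  # hypothetical target column

vote_cols = [c for c in df.columns if "-" in c]

preprocess_vote = Pipeline(steps=[
    ("imputer", SimpleImputer(strategy="constant", fill_value="Abwesend")),
    ("onehot", OneHotEncoder(handle_unknown="ignore")),
])
pipe = Pipeline(steps=[
    ("preprocess", ColumnTransformer(
        sparse_threshold=0,  # force dense output, which PCA requires
        transformers=[("preprocess_vote", preprocess_vote, vote_cols)],
    )),
    ("classifier", Pipeline(steps=[
        ("pca", PCA()),
        ("knn", KNeighborsClassifier()),
    ])),
])

# Nested step names are joined with double underscores.
param_grid = {
    "classifier__pca__n_components": [2, 4],
    "classifier__knn__n_neighbors": [3, 5],
}
search = GridSearchCV(pipe, param_grid, cv=3)
search.fit(df[vote_cols], df["party"])

print(search.best_params_)
print(search.best_score_)
```

The best_params_ dictionary has the same shape as the result quoted above, and best_score_ is the mean cross-validated accuracy for that combination.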

Result

The major takeaway from this project for me was that the visualizations clearly show which parties form the governing coalition in each period, as well as the obligation to vote in accordance with party policy. The classification itself is not very useful, because to classify yourself you would have to take a position on a lot of polls, although this would be a nice idea for further development.

Visualizations (images not preserved here): 19th Bundestag, 18th Bundestag, 17th Bundestag, 20th Bundestag (current), and the combined Bundestags (does that even make sense? 😉)
