Digital Data Analysis, Report 3

This GitHub repository contains all the code used for Report 3 of the Digital Data Analysis course. The notebooks are described below in the order of the workflow that was followed while writing the report.

1. MachineLearningUtility.ipynb

This notebook presents a single plot that shows why a Machine Learning approach has been followed.

2. PCA.ipynb

Principal Component Analysis (P.C.A.) has been performed and described. The curious case of P.C.A. explainability has been explored in more depth here and in this article: https://towardsdatascience.com/p-c-a-meets-explainability-ba1ba5e4636
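As a rough illustration of what P.C.A. explainability can look like in practice, the sketch below reads the explained variance and the component loadings from a fitted scikit-learn PCA. It is not taken from the notebook: the Wine dataset is a stand-in for the report's data, and the variable names are assumptions.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Stand-in data (assumption): the Wine dataset replaces the report's dataset.
wine = load_wine()
X, feature_names = wine.data, wine.feature_names

# Standardize so that no feature dominates a component because of its scale.
X_scaled = StandardScaler().fit_transform(X)

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Fraction of the total variance captured by each component.
explained = pca.explained_variance_ratio_

# Loadings: rows are components, columns are original features.
# Large absolute values mark the features that drive a component.
loadings = pca.components_
order = np.argsort(-np.abs(loadings[0]))
most_informative = [feature_names[i] for i in order]

print("explained variance ratio:", explained)
print("features ranked by weight in PC1:", most_informative)
```

Ranking the original features by their absolute loading is one simple way to tie a component back to measurable quantities; other explainability approaches are possible.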

3. Linear.ipynb

Four datasets have been used to develop a linear Support Vector Machine algorithm:

P.C.A. most informative features dataset
P.C.A. dataset and other features dataset
Original dataset
Most informative features original dataset
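A comparison of this kind could be sketched as follows. This is only an interpretation of the four datasets listed above: the Wine dataset, the use of `SelectKBest` to pick the "most informative" features, and the names in `feature_sets` are all assumptions, not the notebook's actual choices.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the Wine dataset replaces the report's dataset.
X, y = load_wine(return_X_y=True)

# For brevity the transforms are fitted on all the data; wrapping them in a
# Pipeline would avoid leaking information across cross-validation folds.
X_scaled = StandardScaler().fit_transform(X)
X_pca = PCA(n_components=2).fit_transform(X_scaled)
X_top = SelectKBest(f_classif, k=2).fit_transform(X_scaled, y)

# Hypothetical counterparts of the four datasets listed above.
feature_sets = {
    "pca_components": X_pca,
    "pca_plus_other_features": np.hstack([X_pca, X_top]),
    "original": X_scaled,
    "original_most_informative": X_top,
}

# Mean 5-fold cross-validated accuracy of a linear SVM on each feature set.
scores = {
    name: cross_val_score(SVC(kernel="linear", C=1.0), data, y, cv=5).mean()
    for name, data in feature_sets.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```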

4. NonLinearTwoComponents.ipynb

Given that the P.C.A. dataset performs better, a non-linear Support Vector Machine algorithm has been applied to it.
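A minimal sketch of a non-linear SVM on two principal components is given below. The README does not say which kernel was used, so the RBF kernel, the parameter grid, and the Wine stand-in dataset are assumptions.

```python
from sklearn.datasets import load_wine
from sklearn.decomposition import PCA
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in data (assumption): the Wine dataset replaces the report's dataset.
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Scaling and PCA live inside the pipeline, so they are refitted on each
# training fold during the grid search.
model = make_pipeline(StandardScaler(), PCA(n_components=2), SVC(kernel="rbf"))

# Hypothetical grid; make_pipeline names the last step "svc".
param_grid = {"svc__C": [0.1, 1, 10], "svc__gamma": ["scale", 0.1, 1]}

search = GridSearchCV(model, param_grid, cv=5)
search.fit(X_train, y_train)
test_accuracy = search.score(X_test, y_test)

print("best parameters:", search.best_params_)
print("test accuracy:", round(test_accuracy, 3))
```

Changing `n_components` to 3 gives the three-component variant used in the following notebooks.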

5. NonLinearThreeComponentsTwoC.ipynb

Here the three-component P.C.A. dataset has been used to train a non-linear Support Vector Machine algorithm for two-class classification.

6. NonLinearThreeComponentsThreeC.ipynb

Here the three-component P.C.A. dataset has been used to train a non-linear Support Vector Machine algorithm for three-class classification.

7. DecisionTree.ipynb

In this notebook, the Decision Tree and Random Forest approaches have been used for three-class classification. Hyperparameter tuning has been carried out to find the best Random Forest algorithm.
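The tuning step might look like the sketch below, which compares a single decision tree with a grid-searched random forest. The parameter grid and the Wine stand-in dataset (which happens to have three classes) are assumptions, not the values used in the notebook.

```python
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Stand-in three-class data (assumption).
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Baseline: one decision tree with default settings.
tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
tree_accuracy = tree.score(X_test, y_test)

# Hypothetical grid for the random forest hyperparameters.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [None, 3, 5],
    "min_samples_leaf": [1, 3],
}
forest_search = GridSearchCV(
    RandomForestClassifier(random_state=0), param_grid, cv=5
)
forest_search.fit(X_train, y_train)
forest_accuracy = forest_search.score(X_test, y_test)

print("decision tree accuracy:", round(tree_accuracy, 3))
print("best forest parameters:", forest_search.best_params_)
print("random forest accuracy:", round(forest_accuracy, 3))
```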

8. Classification.ipynb

The best tuned SVM algorithm from the previous notebooks has been used to perform a first classification. This classification has then been boosted with the best Random Forest algorithm from the previous notebook, applied in the area where the Support Vector Machine algorithm makes the most wrong predictions. A total accuracy of 82% has been obtained.
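One possible way to combine the two models is sketched below. The README does not say how the "most wrong prediction area" is identified, so this sketch assumes it corresponds to samples where the SVM decision margin is small and hands those samples to the random forest. The `svm_margin` helper, the 20% quantile threshold, and the Wine stand-in dataset are hypothetical, and the sketch is not expected to reproduce the 82% figure.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Stand-in three-class data (assumption).
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

svm = make_pipeline(
    StandardScaler(), SVC(kernel="rbf", decision_function_shape="ovr")
).fit(X_train, y_train)
forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(
    X_train, y_train
)


def svm_margin(model, data):
    """Gap between the two highest per-class decision scores (hypothetical helper)."""
    scores = np.sort(model.decision_function(data), axis=1)
    return scores[:, -1] - scores[:, -2]


# Assumption: the least confident 20% of training samples define the region
# where the SVM is most likely to be wrong.
threshold = np.quantile(svm_margin(svm, X_train), 0.2)

svm_pred = svm.predict(X_test)
uncertain = svm_margin(svm, X_test) < threshold

# Keep the SVM prediction where it is confident, use the forest elsewhere.
hybrid_pred = svm_pred.copy()
if uncertain.any():
    hybrid_pred[uncertain] = forest.predict(X_test[uncertain])

hybrid_accuracy = accuracy_score(y_test, hybrid_pred)
print("samples handed to the forest:", int(uncertain.sum()))
print("hybrid accuracy:", round(hybrid_accuracy, 3))
```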

In addition:

Report3_group1.pdf
