The goal of this project was to implement and evaluate supervised learning models on the Taiwan Bankruptcy Data. We built classification models and compared their evaluation metrics to determine which models performed best.
Models implemented:
- Logistic regression
- NuSVC
- BernoulliNB
- AdaBoostClassifier
- Linear Discriminant Analysis
Project steps:
1. Descriptive analytics, data cleaning and formatting
2. Data preparation for machine learning and feature selection:
- verification of correlation and reducing the number of variables
- feature selection using VarianceThreshold, XGBoost, SelectKBest and Recursive Feature Elimination
- oversampling
- data scaling
- splitting the dataset (train/test)
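
The data preparation steps above can be sketched as follows. This is only an illustration on synthetic data, not the project's actual code: the dataset from `make_classification`, the choice of `k=10`, the `f_classif` score function, and random oversampling via `sklearn.utils.resample` are all assumptions. The sketch also splits first and fits every transformer on the training part only, to avoid leaking test information; the order used in the project may differ.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, VarianceThreshold, f_classif
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils import resample

# Synthetic, imbalanced stand-in for the bankruptcy table (assumption).
X, y = make_classification(
    n_samples=1000, n_features=20, n_informative=6,
    weights=[0.9, 0.1], random_state=0,
)

# Hold out a stratified test set before any fitting.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

# Drop constant features, then keep the 10 features with the highest F-score.
variance_filter = VarianceThreshold(threshold=0.0)
kbest = SelectKBest(score_func=f_classif, k=10)
X_train_sel = kbest.fit_transform(variance_filter.fit_transform(X_train), y_train)
X_test_sel = kbest.transform(variance_filter.transform(X_test))

# Random oversampling: draw minority rows with replacement
# until both classes have the same number of training rows.
majority = X_train_sel[y_train == 0]
minority = X_train_sel[y_train == 1]
minority_up = resample(
    minority, replace=True, n_samples=len(majority), random_state=0
)
X_train_bal = np.vstack([majority, minority_up])
y_train_bal = np.concatenate([
    np.zeros(len(majority), dtype=int),
    np.ones(len(minority_up), dtype=int),
])

# Fit the scaler on the balanced training data, apply it to both parts.
scaler = StandardScaler().fit(X_train_bal)
X_train_scaled = scaler.transform(X_train_bal)
X_test_scaled = scaler.transform(X_test_sel)

print(X_train_scaled.shape, X_test_scaled.shape)
```

Oversampling only the training rows keeps the test set at its original class balance, so the reported metrics reflect the real distribution.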
3. Model research:
- Logistic regression
- NuSVC
- BernoulliNB
- AdaBoostClassifier
- Linear Discriminant Analysis
4. Implementing the models on our data
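
As a rough sketch of this step, the five listed models can be fitted in one loop with scikit-learn. The synthetic dataset and the default hyperparameters are assumptions made for illustration; they are not the project's settings or results.

```python
from sklearn.datasets import make_classification
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import AdaBoostClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import BernoulliNB
from sklearn.preprocessing import StandardScaler
from sklearn.svm import NuSVC

# Synthetic, balanced stand-in for the prepared data (assumption).
X, y = make_classification(
    n_samples=600, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale with statistics from the training part only.
scaler = StandardScaler().fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)

# The five models from the list, with default settings.
models = {
    "Logistic regression": LogisticRegression(max_iter=1000),
    "NuSVC": NuSVC(),
    "BernoulliNB": BernoulliNB(),
    "AdaBoostClassifier": AdaBoostClassifier(random_state=0),
    "Linear Discriminant Analysis": LinearDiscriminantAnalysis(),
}

# Fit each model and record its accuracy on the held-out test set.
test_accuracy = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    test_accuracy[name] = model.score(X_test, y_test)

for name, acc in test_accuracy.items():
    print(f"{name}: {acc:.3f}")
```

Note that `BernoulliNB` binarizes its inputs (at 0.0 by default), so after standardization it effectively sees whether each feature is above or below its training mean.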
5. Hyperparameter tuning
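
One common way to tune hyperparameters is a cross-validated grid search. The sketch below uses `GridSearchCV` on logistic regression; the search method, the grid of `C` values, the `roc_auc` scoring, and the synthetic data are all assumptions, since the source does not say how tuning was done.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the prepared data (assumption).
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, random_state=0
)

# Putting the scaler in the pipeline refits it inside every CV fold,
# so validation folds never influence the scaling statistics.
pipeline = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])

# Hypothetical grid over the inverse regularization strength C.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=5)
search.fit(X, y)

print("best parameters:", search.best_params_)
print("best mean CV ROC AUC:", round(search.best_score_, 3))
```

The same pattern applies to the other models by swapping the `clf` step and the grid, for example `clf__nu` for `NuSVC` or `clf__n_estimators` for `AdaBoostClassifier`.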
6. Comparing the results using metrics:
- accuracy
- recall
- precision
- ROC AUC score
- ROC curve plot
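
To make the metrics above concrete, here is a small library-free sketch of their definitions. The labels and scores are made-up toy values, not project results, and the helper names are hypothetical. In practice `sklearn.metrics` provides equivalents (`accuracy_score`, `recall_score`, `precision_score`, `roc_auc_score`, `roc_curve`).

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, tn, fn) for binary labels in {0, 1}."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, fp, tn, fn


def roc_auc(y_true, y_score):
    """Probability that a random positive outscores a random negative
    (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


def roc_points(y_true, y_score):
    """(FPR, TPR) pairs from sweeping the threshold over all scores."""
    n_pos = sum(y_true)
    n_neg = len(y_true) - n_pos
    points = [(0.0, 0.0)]
    for thr in sorted(set(y_score), reverse=True):
        y_pred = [1 if s >= thr else 0 for s in y_score]
        tp, fp, _, _ = confusion_counts(y_true, y_pred)
        points.append((fp / n_neg, tp / n_pos))
    return points


# Toy labels and predicted scores (illustrative only).
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y_score = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.35, 0.8, 0.9, 0.65]
y_pred = [1 if s >= 0.5 else 0 for s in y_score]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
recall = tp / (tp + fn)
precision = tp / (tp + fp)
auc = roc_auc(y_true, y_score)
curve = roc_points(y_true, y_score)

print(accuracy, recall, precision, round(auc, 3))
```

Plotting the first element of each pair in `curve` against the second (for example with matplotlib) gives the ROC curve. For an imbalanced target such as bankruptcy, recall and precision are usually more informative than accuracy alone.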