This repository contains code for the three main families of feature selection methods in machine learning: filter methods, wrapper methods, and embedded methods. All code is written in Python 3.

Requirements:

1. Python 3.5+
2. Jupyter Notebook
3. scikit-learn
4. NumPy [+mkl for Windows]
5. pandas
6. Matplotlib
7. Seaborn

Datasets:

1. Santander Customer Satisfaction Dataset
2. BNP Paribas Cardif Claims Management Dataset

S.No. | Name | About | Status |
---|---|---|---|
1. | Constant Feature Elimination | This notebook explains how to remove constant features during the pre-processing step. | Completed |
2. | Quasi-Constant Feature Elimination | This notebook explains how to identify quasi-constant features and remove them during pre-processing. | Completed |
3. | Duplicate Features Elimination | This notebook explains how to find duplicate features in a dataset and remove them. | Completed |
4. | Correlation | This notebook explains how to compute feature-to-feature and feature-to-target correlations and use them to choose the best features. | Completed |
5. | Machine Learning Pipeline | This notebook explains how to combine all of the above methods in an ML pipeline, with a performance comparison. | Completed |
6. | Mutual Information | This notebook explains the concept of mutual information and uses it, for both classification and regression, to find the best features in a dataset. | Completed |
7. | Fisher Score Chi Square | This notebook explains the Fisher score (chi2) for feature selection. | Completed |
8. | Univariate Feature Selection | This notebook explains univariate feature selection for classification and regression. | Completed |
9. | Univariate ROC/AUC/MSE | | Ongoing |
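
As a rough illustration of several filter steps listed in the table (constant, quasi-constant, duplicate, and correlated feature removal, followed by mutual information ranking), here is a minimal sketch using scikit-learn and pandas. It is not taken from the notebooks: the synthetic data, the helper names (`drop_low_variance`, `drop_duplicates`, `drop_correlated`), and the thresholds (0.01 for variance, 0.9 for correlation) are assumptions chosen for the example.

```python
from functools import partial

import numpy as np
import pandas as pd
from sklearn.feature_selection import SelectKBest, VarianceThreshold, mutual_info_classif

# Synthetic stand-in data (hypothetical; not the datasets used by the notebooks).
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
c = rng.normal(size=n)
quasi = np.zeros(n)
quasi[0] = 1.0  # a single non-zero value makes this column quasi-constant

X = pd.DataFrame({
    "const": np.ones(n),                      # constant feature
    "quasi": quasi,                           # quasi-constant feature
    "a": a,                                   # informative feature
    "a_copy": a,                              # exact duplicate of "a"
    "b": 2 * a + 0.01 * rng.normal(size=n),   # almost perfectly correlated with "a"
    "c": c,                                   # informative feature
    "d": rng.normal(size=n),                  # pure noise
})
y = (a + c > 0).astype(int)  # binary target that depends on "a" and "c" only


def drop_low_variance(X, threshold=0.0):
    """Keep only the columns whose variance is strictly greater than `threshold`."""
    selector = VarianceThreshold(threshold=threshold)
    selector.fit(X)
    return X.loc[:, selector.get_support()]


def drop_duplicates(X):
    """Drop columns that exactly repeat an earlier column."""
    # Transposing can be memory-hungry on very wide datasets.
    is_duplicate = X.T.duplicated().to_numpy()
    return X.loc[:, ~is_duplicate]


def drop_correlated(X, threshold=0.9):
    """Drop the later column of every pair whose absolute Pearson correlation exceeds `threshold`."""
    corr = X.corr().abs()
    # Look only above the diagonal so each pair is considered once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return X.drop(columns=to_drop)


X_no_const = drop_low_variance(X, threshold=0.0)            # removes "const"
X_no_quasi = drop_low_variance(X_no_const, threshold=0.01)  # removes "quasi"
X_unique = drop_duplicates(X_no_quasi)                      # removes "a_copy"
X_filtered = drop_correlated(X_unique, threshold=0.9)       # removes "b"

# Rank the surviving features by estimated mutual information with the target.
score_func = partial(mutual_info_classif, random_state=0)
selector = SelectKBest(score_func=score_func, k=2).fit(X_filtered, y)
selected = list(X_filtered.columns[selector.get_support()])

print("After filtering:", list(X_filtered.columns))
print("Top 2 by mutual information:", selected)
```

The order of the steps matters mostly for cost: removing constant and duplicate columns first shrinks the correlation matrix computed afterwards. On real data, the thresholds would normally be fitted on the training split only and then applied to the test split.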