Modeled and predicted a person's yearly income level, i.e. whether a person earns more or less than $50,000 per year. I started with a basic algorithm, logistic regression, then added the SMOTE technique to deal with the high class imbalance in the data. I also tried algorithms based on decision trees (AdaBoost, Random Forest, Gradient Boosting) and a Naive Bayes classifier. To assess and compare the classifiers, I used precision, recall, ROC AUC, accuracy, and the confusion matrix.
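The two ideas that do most of the work in this description, oversampling the minority class and scoring a classifier from its confusion matrix, can be illustrated with a minimal pure-Python sketch. This is not the repository's code: the function names (`smote_like_oversample`, `confusion_counts`, `classification_metrics`) and the toy data are hypothetical, and the oversampler only mirrors the core SMOTE idea of interpolating between a minority sample and one of its nearest minority neighbours. ROC AUC is left out to keep the example short.

```python
import random


def smote_like_oversample(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority points (hypothetical helper).

    Each point lies on the segment between a randomly chosen minority
    sample and one of its k nearest minority neighbours, which is the
    interpolation step at the heart of SMOTE.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append(
            tuple(a + gap * (b - a) for a, b in zip(base, neighbour))
        )
    return synthetic


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, fp, fn, tn) for a binary problem."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Precision, recall and accuracy derived from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "accuracy": (tp + tn) / len(y_true),
    }


# Toy data (made up for illustration): 1 = income above $50,000.
minority_points = [(1.0, 1.0), (2.0, 2.0), (1.5, 1.0)]
new_points = smote_like_oversample(minority_points, n_new=4)

y_true = [1, 0, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 0, 1, 0]
metrics = classification_metrics(y_true, y_pred)
print(new_points)
print(confusion_counts(y_true, y_pred), metrics)
```

On imbalanced data, accuracy alone can look good even when the minority class is mostly missed, which is why precision and recall are reported alongside it. In practice, oversampling is normally applied to the training split only, so that the evaluation data keeps the original class distribution.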
yesminebellalah / benchmarking-study-of-income-classification-with-regards-to-data-collected-in-2010-us-census