Thyroid-Disease-Detection
Thyroid disease is a very common problem in India; more than one crore (10 million) people suffer from it every year. Thyroid disorders can speed up or slow down the body's metabolism.
The main objective of this project is to predict whether a person falls into one of four classes: 'compensated_hypothyroid', 'negative' (no thyroid disease), 'primary_hypothyroid', or 'secondary_hypothyroid'. Machine learning classification algorithms (XGBoost, Random Forest, Decision Tree, and K-Nearest Neighbors) were trained on the thyroid dataset from the UCI Machine Learning Repository.
Random Forest performed best, with the highest accuracy (99.16%) and better precision and recall than the other models. The ExtraTreesClassifier algorithm is used for feature selection and feature importance.
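The feature selection step can be illustrated with a minimal sketch. This is not the project's notebook code: the data is synthetic (generated with `make_classification`), and the feature names and the `top_k` cut-off are hypothetical placeholders. It only shows how scikit-learn's `ExtraTreesClassifier` exposes impurity-based importances that can be used to rank and select columns.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import ExtraTreesClassifier

# Synthetic stand-in for the thyroid data (assumption): 4 classes, 10 features.
X, y = make_classification(
    n_samples=500, n_features=10, n_informative=5, n_redundant=2,
    n_classes=4, random_state=42,
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]  # hypothetical names

# Fit an extra-trees ensemble; its feature_importances_ sum to 1.
selector = ExtraTreesClassifier(n_estimators=100, random_state=42)
selector.fit(X, y)

# Rank features by importance, highest first.
order = np.argsort(selector.feature_importances_)[::-1]
ranked = [(feature_names[i], float(selector.feature_importances_[i])) for i in order]

# Keep only the top-k columns for later model building (k is a hypothetical choice).
top_k = 5
selected = [name for name, _ in ranked[:top_k]]
X_selected = X[:, order[:top_k]]

for name, importance in ranked:
    print(f"{name}: {importance:.3f}")
```

With the real dataset, `X` and `feature_names` would come from the cleaned DataFrame, and the cut-off would be chosen by inspecting the ranking.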
Technical Aspects
Python 3.7
Front-end: HTML
Back-end: Flask framework
IDE: Jupyter Notebook, PyCharm
Deployment: AWS
Data Collection
The dataset was collected from the UCI Machine Learning Repository ("Thyroid Disease Detection").
Model Creation and Evaluation
Various classification algorithms (XGBoost, Random Forest, Decision Tree, and K-Nearest Neighbors) were tested.
Random Forest gave the best results.
Random Forest was chosen for the final model training and testing.
Model performance was evaluated using accuracy, the confusion matrix, the classification report, and the ROC curve (AUC).
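The comparison and evaluation steps above can be sketched roughly as follows. This is an illustrative example on synthetic data, not the notebook code: the dataset, the split ratio, and the hyperparameters are assumptions, and XGBoost is left out so the sketch depends only on scikit-learn. Multi-class AUC is computed here one-vs-rest, which is one possible choice.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import (
    accuracy_score, classification_report, confusion_matrix, roc_auc_score,
)
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Synthetic 4-class stand-in for the thyroid data (assumption).
X, y = make_classification(
    n_samples=800, n_features=10, n_informative=6, n_classes=4, random_state=0,
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0,
)

# Candidate models; hyperparameters are illustrative defaults.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Decision Tree": DecisionTreeClassifier(random_state=0),
    "K-Nearest Neighbors": KNeighborsClassifier(n_neighbors=5),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    y_proba = model.predict_proba(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
        "report": classification_report(y_test, y_pred, zero_division=0),
        # One-vs-rest AUC averaged over the four classes.
        "roc_auc_ovr": roc_auc_score(y_test, y_proba, multi_class="ovr"),
    }

for name, metrics in results.items():
    print(f"{name}: accuracy={metrics['accuracy']:.4f}, AUC={metrics['roc_auc_ovr']:.4f}")
    print(metrics["confusion_matrix"])
    print(metrics["report"])
```

The numbers printed by this sketch come from synthetic data and say nothing about the results reported for the real dataset.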
Model Deployment
The final model is deployed on AWS using the Flask framework.
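As a rough idea of what serving a model through Flask can look like, here is a minimal sketch. It is not the deployed application: the `/predict` route, the JSON payload shape, the feature count, and the index-to-label mapping are all hypothetical, and the model is trained on synthetic data at startup instead of being loaded from a saved file. The project itself uses an HTML front end, so its routes likely differ.

```python
from flask import Flask, jsonify, request
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Class names from the project description; the index order is an assumption.
LABELS = [
    "compensated_hypothyroid",
    "negative",
    "primary_hypothyroid",
    "secondary_hypothyroid",
]
N_FEATURES = 6  # hypothetical number of input features

# Stand-in model trained on synthetic data (assumption); a real deployment
# would load the trained Random Forest from disk instead.
X, y = make_classification(
    n_samples=400, n_features=N_FEATURES, n_informative=4, n_redundant=0,
    n_classes=4, random_state=1,
)
model = RandomForestClassifier(n_estimators=50, random_state=1).fit(X, y)

app = Flask(__name__)


@app.route("/predict", methods=["POST"])
def predict():
    """Return the predicted class for a JSON body like {"features": [...]}."""
    payload = request.get_json(silent=True)
    features = payload.get("features") if isinstance(payload, dict) else None
    if not isinstance(features, list) or len(features) != N_FEATURES:
        return jsonify(error=f"expected a list of {N_FEATURES} numbers under 'features'"), 400
    try:
        row = [float(value) for value in features]
    except (TypeError, ValueError):
        return jsonify(error="features must be numeric"), 400
    class_index = int(model.predict([row])[0])
    return jsonify(prediction=LABELS[class_index])
```

Locally, an app like this could be started with Flask's development server; on AWS it would normally sit behind a production WSGI server.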
Don't forget to star the repository.
NOTE:
Due to the large size of the Jupyter notebook, I have split it into 3 parts, stored in the Notebooks folder:
- EDA
- Model building before feature selection
- Feature selection and model building