
Capstone-Project

Education Success Rates Based on Various Factors

This is a machine learning model that predicts whether a student will succeed in a learning environment, that is, whether the student graduates or drops out.

This project will explore the significance that specific demographic, geographic, psychographic, and financial factors have on a student’s academic performance by looking at data sources that show the academic performance of several students in relation to the mentioned factors. This project will use data analysis, data cleaning, and predictive models to determine how a student will do based on these factors.


Author: Naiema Elsaadi
Date: November 21, 2023
Northwest Missouri State University, Maryville MO 64468, USA

Overview

This project focuses on education, which is often regarded as one of the most important pillars of success today. Success in the classroom can be a good indicator of future success in employment and can mark the beginning of a financially secure life. The problem this project analyzes is which factors lead to student success in school and which lead to failure. The project attempts to highlight the main factors behind both outcomes so that people can be more aware of them.

This repository contains the code, data, and documentation for the Education Success Analytics project. The project analyzes factors influencing education success rates using a dataset obtained from Kaggle.

Data Sources

The dataset used for this project comes from Kaggle, a popular platform for datasets. It provides information on student performance in school (GPAs, ACT scores, dropout rates), the financial situation of students, and geographic, demographic, and psychographic information on students and their families.
Source data (CSV file):

https://www.kaggle.com/datasets/naveenkumar20bps1137/predict-students-dropout-and-academic-success?select=dataset.csv
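
As a minimal sketch of a first look at the data, the snippet below uses the standard-library csv module to count how many rows fall into each outcome class. The column name Target, the sample column names, and the labels in the sample rows are assumptions for illustration only; check the header of the downloaded dataset.csv and adjust them accordingly.

```python
import csv
import io
from collections import Counter


def count_outcomes(csv_file, target_column="Target"):
    """Count rows per outcome label in an open CSV file.

    target_column is an assumed column name; adjust it to match the header.
    """
    reader = csv.DictReader(csv_file)
    if reader.fieldnames is None or target_column not in reader.fieldnames:
        raise KeyError(f"column {target_column!r} not found in CSV header")
    return Counter(row[target_column] for row in reader)


# Tiny in-memory stand-in for the real file (hypothetical columns and values).
sample = io.StringIO(
    "Age at enrollment,Scholarship holder,Target\n"
    "19,1,Graduate\n"
    "24,0,Dropout\n"
    "20,0,Graduate\n"
)
outcome_counts = count_outcomes(sample)
print(outcome_counts)
```

To run the same count on the downloaded file, open it with open(path, newline="") and pass the file object to count_outcomes, where path points to wherever the CSV was saved.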

Prerequisites

Before running the project, ensure you have the following prerequisites:

  • Git

  • GitHub

  • Python 3.10+ installed

  • Anaconda Prompt (miniconda3)

  • JupyterLab

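  • Environment setup:
    - The csv module (part of the Python standard library) is used in this project.
    - Activate the virtual environment: source .venv/bin/activate (macOS/Linux) or .venv\Scripts\activate (Windows)
    - Install the remaining dependencies: pip install -r requirements.txt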

  • Required Python libraries (imports)
    import warnings

    import matplotlib as mpl
    import matplotlib.cm as cm
    import matplotlib.pyplot as plt
    import numpy as np
    import pandas as pd
    import seaborn as sns
    from sklearn import svm
    from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import (accuracy_score, classification_report, confusion_matrix,
                                 f1_score, precision_score, recall_score, roc_auc_score, roc_curve)
    from sklearn.model_selection import train_test_split
    from sklearn.neural_network import MLPClassifier
    from sklearn.preprocessing import LabelEncoder
    from sklearn.tree import DecisionTreeClassifier

    # Suppress library warnings in notebook output
    warnings.filterwarnings('ignore')
    # Jupyter magic: render plots inline in the notebook
    %matplotlib inline

How to Run

Follow these steps to run the project:

  1. Clone the repository to your local machine.
  2. Navigate to the project directory.
  3. Open Anaconda Prompt (miniconda3) and change to the project directory by entering cd followed by the path to the project.
  4. Run jupyter lab.
  5. Explore the notebooks, scripts, and data folders for detailed insights into the project.
  6. Refer to the README.md for additional information.
  7. Work through the project phases:
    – Phase 1: Acquire Data source
    – Phase 2: Data Preprocessing and Cleaning
    – Phase 3: Exploratory Data Analysis
    – Phase 4: Designing Predictive Models
    – Phase 5: Building Predictive Models
    – Phase 6: Model Evaluation
    – Phase 7: Final Model Selection
    – Phase 8: Results and Discussion
    – Phase 9: Conclusion
    – Phase 10: Limitations and Future Work
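
Phases 5 to 7 above (building, evaluating, and selecting models) can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual notebook code: the data is a synthetic stand-in generated with make_classification, the outcome is treated as binary, only three of the imported classifiers are compared, and choosing the final model by F1 score is an assumed selection rule.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the cleaned student data (assumption: binary outcome).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=42
)

# Hold out 20% as a test set; stratify keeps the class ratio similar in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Candidate models to compare (a subset chosen for brevity).
candidates = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(random_state=42),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=42),
}

# Fit each model on the training split and score it on the held-out split.
results = {}
for name, model in candidates.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, predictions),
        "f1": f1_score(y_test, predictions),
    }

# Assumed selection rule: keep the model with the highest F1 score.
best_model = max(results, key=lambda name: results[name]["f1"])

for name, scores in results.items():
    print(f"{name}: accuracy={scores['accuracy']:.3f}, f1={scores['f1']:.3f}")
print("selected:", best_model)
```

With the real dataset, X and y would come from the cleaned features and the encoded outcome column, for example after applying LabelEncoder to the outcome labels.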

Overleaf Report

The detailed report for this project is available on Overleaf. You can access it and view it here.

Key Computational Resources

Project Structure

The project structure is organized as follows:

  • data: Contains the dataset used in the project (data.csv).
  • notebooks: Jupyter notebooks for data analysis and modeling.
  • scripts: Python scripts for specific tasks.
  • 1_Exploratory_Data_Analysis.ipynb: Notebook for exploratory data analysis.
  • 2_Data_Cleaning_and_Feature_Engineering.ipynb: Notebook for data cleaning and feature engineering.
  • 3_Modeling.ipynb: Notebook for applying machine learning models.
  • 4_model_training.ipynb: Notebook for training machine learning models.
  • 5_Visualizing.ipynb: Notebook for visualizing the machine learning results.
  • Final_Project.ipynb: Notebook for applying machine learning to the project.
  • Reports: Overleaf report.
  • text_workbooks.txt: Text workbooks used for data exploration and analysis.
  • photos.png: Screenshots of the results.
  • README.md: The main documentation file that introduces the project, explains the structure, and provides relevant links.

Screenshots

How It Works

[Image: sample image]

Output and Results

Results for all of the models
[Screenshot]

1. Extract and display feature importances and visualize them with a bar plot
[Screenshot]

2. Correlation of each feature
[Screenshot]

3. Visualizing the combined ROC curves
[Screenshot]
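
The three outputs listed above could be computed along the following lines. This is a hedged sketch on synthetic data rather than the notebook's actual code: the feature names are hypothetical, "correlation of each feature" is read here as correlation with the outcome (one plausible interpretation), and only two models are included in the combined ROC data.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score, roc_curve
from sklearn.model_selection import train_test_split

# Synthetic stand-in data with hypothetical feature names.
X, y = make_classification(
    n_samples=400, n_features=8, n_informative=4, random_state=0
)
feature_names = [f"feature_{i}" for i in range(X.shape[1])]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_train, y_train)
logreg = LogisticRegression(max_iter=1000).fit(X_train, y_train)

# 1. Feature importances from the random forest, most important first.
order = np.argsort(forest.feature_importances_)[::-1]
ranked_importances = [
    (feature_names[i], float(forest.feature_importances_[i])) for i in order
]

# 2. Pearson correlation of each feature with the outcome (assumed reading).
correlations = {
    name: float(np.corrcoef(X[:, i], y)[0, 1])
    for i, name in enumerate(feature_names)
}

# 3. ROC curve points and AUC per model, ready to draw on one set of axes.
roc_data = {}
for name, model in {"random_forest": forest, "logistic_regression": logreg}.items():
    scores = model.predict_proba(X_test)[:, 1]  # probability of the positive class
    fpr, tpr, _ = roc_curve(y_test, scores)
    roc_data[name] = {"fpr": fpr, "tpr": tpr, "auc": roc_auc_score(y_test, scores)}

for name, importance in ranked_importances[:3]:
    print(f"{name}: importance={importance:.3f}")
for name, data in roc_data.items():
    print(f"{name}: AUC={data['auc']:.3f}")
```

For the plots themselves, ranked_importances can be passed to plt.barh, and each fpr/tpr pair in roc_data can be drawn with plt.plot on shared axes to obtain a combined ROC figure.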

Conclusion

This README.md serves as a guide to navigate through the project. For detailed analysis, findings, and visualizations, please refer to the Overleaf report linked above.

Feel free to reach out with any questions or requests for clarification.

Contact: Naiema Elsaadi
Email: [email protected]

References

  1. Horton, J.: Identifying at-risk factors that affect college student success. International Journal of Process Education 7(1), 83–101 (2015)
  2. Kuh, G.D., Kinzie, J.L., Buckley, J.A., Bridges, B.K., Hayek, J.C.: What matters to student success: A review of the literature, vol. 8. National Postsecondary Education Cooperative Washington, DC (2006)
  3. Kumar, K.N.: Predict students’ dropout, academic success. https://www.kaggle.com/datasets/naveenkumar20bps1137/predict-students-dropout-and-academic-success?select=dataset.csv (2021), (accessed: 2023-10-22)
  4. Shah, N.S.: Predicting factors that affect students’ academic performance by using data mining techniques. Pakistan Business Review 13(4), 631–638 (2012)
