yacine-ammi / mobile-ccp

This repository shares the steps taken in my graduation thesis for an Applied Statistics diploma. The project builds a machine learning model to predict customer churn at a telecom company. The repository includes the dataset used in the project, the relevant code, and the thesis in PDF format.

Jupyter Notebook 100.00%
churn-prediction churn churn-analysis machine-learning xgboost rfm-analysis clustering classification optuna shap telecom telecommunication data-science data-mining

mobile-ccp's Introduction

Project Overview

Objective

The objective of this project is to analyze a public dataset of customers from a telecom company and predict whether a customer will switch to another company, so that the company can act to retain at-risk customers and thereby increase profitability.

Dataset Description

  • This project utilizes a public dataset of 66,469 customers from an anonymous telecommunications company.
  • The goal of the project is to predict customer churn and increase profitability for the company.
  • Data preprocessing and cleaning techniques were applied to the 66 features before the modelling phase.

Methodology

This project used both univariate and multivariate analysis to extract insights from visualizations and correlation matrices. A data preprocessing stage then prepared the data for modelling. Several classification models were compared to determine which performed best at identifying customers who may churn. The XGBoost model was chosen, evaluated, and interpreted using the SHAP package.
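As a rough illustration of the correlation-matrix step, the sketch below computes pairwise Pearson correlations in plain Python. It is not the code from the notebooks: all values are invented, and the column names monthly_calls and months_inactive are hypothetical (only user_spendings appears in this README).

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Nested dict holding the correlation of every pair of columns."""
    names = list(columns)
    return {a: {b: pearson(columns[a], columns[b]) for b in names} for a in names}


# Hypothetical toy data: the values, and the names "monthly_calls" and
# "months_inactive", are invented for illustration and not taken from the dataset.
toy_columns = {
    "user_spendings": [10.0, 20.0, 30.0, 40.0, 50.0],
    "monthly_calls": [12.0, 18.0, 33.0, 41.0, 48.0],
    "months_inactive": [5.0, 4.0, 3.0, 2.0, 1.0],
}

matrix = correlation_matrix(toy_columns)
for name, row in matrix.items():
    print(name, {other: round(r, 2) for other, r in row.items()})
```

Values close to 1 or -1 flag strongly related feature pairs, which is the kind of signal a correlation matrix is typically inspected for before modelling.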

Modelling

  • Univariate and multivariate analysis were used to extract insights from the dataset.
  • Several classification models were tested, including ensemble learning methods like XGBoost.
  • The XGBoost model outperformed the other models with an accuracy of 90%.
  • Hyperparameter optimization was used to improve recall and meet business needs.
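The emphasis on recall alongside accuracy can be illustrated with a small sketch. The confusion-matrix counts below are made up for the example and are not results from this project; they only show how two models with the same accuracy can differ sharply in how many churners they catch.

```python
def accuracy(tp, fp, fn, tn):
    """Share of all customers classified correctly."""
    return (tp + tn) / (tp + fp + fn + tn)


def recall(tp, fp, fn, tn):
    """Share of actual churners that the model flags (fp and tn are unused)."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def precision(tp, fp, fn, tn):
    """Share of flagged customers who really churn (fn and tn are unused)."""
    return tp / (tp + fp) if (tp + fp) else 0.0


# Hypothetical counts for 1,000 customers, 100 of whom churn.
# "always_stay" predicts that nobody churns; "recall_tuned" flags more customers.
always_stay = {"tp": 0, "fp": 0, "fn": 100, "tn": 900}
recall_tuned = {"tp": 80, "fp": 80, "fn": 20, "tn": 820}

for label, counts in [("always_stay", always_stay), ("recall_tuned", recall_tuned)]:
    print(
        label,
        "accuracy:", accuracy(**counts),
        "recall:", recall(**counts),
        "precision:", precision(**counts),
    )
```

In this invented case both models reach 90% accuracy, but only the second one finds most of the churners, which is why a tuning objective built around recall can serve a retention use case better than accuracy alone.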

Results

  • The XGBClassifier model identifies both churners and non-churners effectively.
  • User spending (user_spendings) was found to be the most relevant feature in predicting customer churn.
  • SHAP values were used to interpret the model and examine feature influence.
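SHAP values are based on Shapley values from cooperative game theory. The sketch below computes exact Shapley values by brute force for a tiny hand-written scoring function. It is only meant to show the idea behind the attribution and is not how the SHAP package or this project computes it; the scoring function, its numbers, and the feature names complaints and tenure are hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, features):
    """Exact Shapley value of each feature, enumerating every coalition."""
    n = len(features)
    phi = {}
    for feature in features:
        others = [f for f in features if f != feature]
        total = 0.0
        for size in range(n):
            # Weight of a coalition of this size in the Shapley formula.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = set(subset)
                gain = value_fn(coalition | {feature}) - value_fn(coalition)
                total += weight * gain
        phi[feature] = total
    return phi


def toy_model_output(present):
    """Hypothetical churn score when only the features in `present` are known."""
    score = 0.0
    if "user_spendings" in present:
        score += 0.3
    if "complaints" in present:
        score += 0.1
    if {"user_spendings", "complaints"} <= present:
        score += 0.2  # interaction term, split evenly between the two features
    return score


features = ["user_spendings", "complaints", "tenure"]
phi = shapley_values(toy_model_output, features)
for name, value in phi.items():
    print(name, round(value, 3))
```

The attributions sum to the difference between the score with all features and the score with none, which is the property that makes per-feature contributions comparable both for a single customer and, once aggregated, across a dataset.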

Project Structure

  • data: contains the raw dataset.
  • data dictionary: a table in PDF format.
  • notebook: contains Jupyter notebooks for data preprocessing, EDA, and modelling.

Conclusion

This study concluded that ensemble learning algorithms such as XGBoost performed best in predicting customer churn, with an accuracy of 90%. The XGBClassifier model identifies both churners and non-churners effectively. The most relevant feature in predicting customer churn is user_spendings. SHAP values were used to interpret the model and examine the influence of each feature on churn prediction, both globally across the whole dataset and individually for two random customers.
