varunsh20 / predicting-term-deposit-subscriptions

The goal of this project is to build a machine learning model that can predict whether a client will subscribe to a term deposit or not.

License: MIT License

machine-learning exploratory-data-analysis random-forest xgboost logistic-regression flask herokuapp

Predicting-term-deposit-subscriptions

The goal of this project is to build a machine learning model that can predict whether a client will subscribe to a term deposit or not. The model predicts which clients are likely to subscribe to a term deposit before agents make phone calls. It also helps banks decide how many clients need to be contacted to meet their business target, so that they can plan the scope, budget and resources of a marketing campaign accordingly. The dataset used in this project was obtained from Kaggle.

Here is the link to the dataset: https://www.kaggle.com/janiobachmann/bank-marketing-dataset

The dataset includes records for 11,162 customers.

The features used in the data are as follows:

bank client data:

  • age (numeric)
  • job : type of job (categorical: 'admin.','blue-collar','entrepreneur','housemaid','management','retired','self-employed','services','student','technician','unemployed','unknown')
  • marital : marital status (categorical: 'divorced','married','single','unknown'; note: 'divorced' means divorced or widowed)
  • education (categorical: 'basic.4y','basic.6y','basic.9y','high.school','illiterate','professional.course','university.degree','unknown')
  • default: has credit in default? (categorical: 'no','yes','unknown')
  • housing: has housing loan? (categorical: 'no','yes','unknown')
  • loan: has personal loan? (categorical: 'no','yes','unknown')

related with the last contact of the current campaign:

  • contact: contact communication type (categorical: 'cellular','telephone')
  • month: last contact month of year (categorical: 'jan', 'feb', 'mar', ..., 'nov', 'dec')
  • day_of_week: last contact day of the week (categorical: 'mon','tue','wed','thu','fri')
  • duration: last contact duration, in seconds (numeric). Important note: this attribute highly affects the output target (e.g., if duration=0 then y='no').

other attributes:

  • campaign: number of contacts performed during this campaign and for this client (numeric, includes last contact)
  • pdays: number of days that passed by after the client was last contacted from a previous campaign (numeric; 999 means client was not previously contacted)
  • previous: number of contacts performed before this campaign and for this client (numeric)
  • poutcome: outcome of the previous marketing campaign (categorical: 'failure','nonexistent','success')

Output variable (desired target):

  • has the client subscribed to a term deposit? (binary: 'yes','no')
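
Most of the features above are categorical, so they have to be converted to numbers before a model can use them. The sketch below shows one common approach, one-hot encoding with pandas. The three rows are made up for illustration, the target column name `deposit` is an assumption, and the choice of encoding is not confirmed as the project's own preprocessing.

```python
import pandas as pd

# Toy rows for illustration only (not taken from the real dataset).
df = pd.DataFrame(
    {
        "age": [35, 52, 28],
        "marital": ["married", "single", "divorced"],
        "housing": ["yes", "no", "yes"],
        "deposit": ["no", "yes", "no"],  # assumed name for the target column
    }
)

# Map the binary target to 0/1.
y = df["deposit"].map({"no": 0, "yes": 1})

# One-hot encode the categorical predictors; numeric columns pass through unchanged.
X = pd.get_dummies(df.drop(columns="deposit"), columns=["marital", "housing"])

print(X.columns.tolist())
print(y.tolist())
```

Each category becomes its own indicator column (for example `marital_married`), which avoids implying an order between categories.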

Exploratory data analysis

Exploring categorical features

b1

From the above plots we can conclude that:

  • Management is the most common job type in the dataset.
  • Married clients account for the largest number of records.
  • Most of the clients have secondary education.
  • The default feature does not seem to be important, so it can be dropped.
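
Observations like these usually come from counting how often each category occurs. The following is a minimal sketch with pandas on made-up rows. The reason given in the code for dropping `default` (a column dominated by one value) is an assumption, since the project does not state its reasoning.

```python
import pandas as pd

# Toy rows for illustration only (not taken from the real dataset).
df = pd.DataFrame(
    {
        "job": ["management", "technician", "management", "services"],
        "marital": ["married", "married", "single", "married"],
        "default": ["no", "no", "no", "no"],
    }
)

# Share of records in each category, per column.
for column in ["job", "marital", "default"]:
    print(df[column].value_counts(normalize=True), end="\n\n")

# Assumed rationale: a column dominated by a single value carries
# little information, so it is a candidate for removal.
df = df.drop(columns=["default"])
```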

Exploring numerical features

eda3

From the above plots we can conclude that:

  • Features such as age and day are approximately normally distributed.
  • Features such as balance, duration, campaign, pdays and previous are highly right-skewed (most values sit at the low end, with a long tail to the right) and seem to have some outliers.
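
Skewness can also be checked numerically rather than by eye. The sketch below uses pandas `DataFrame.skew()` on two synthetic columns (not the real data): a value near 0 indicates a roughly symmetric distribution, while a positive value indicates that most values sit at the low end with a long tail to the right.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)

# Synthetic columns for illustration only: one symmetric, one with a long right tail.
df = pd.DataFrame(
    {
        "symmetric": rng.normal(loc=40, scale=10, size=5000),
        "long_tail": rng.exponential(scale=300, size=5000),
    }
)

# Sample skewness per column: near 0 for symmetric data,
# positive when the tail extends to the right.
skewness = df.skew()
print(skewness)
```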

Finding Outliers in numerical features

eda4

It can be seen that age, balance, duration, pdays and previous have some outliers.
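
One common way to flag such outliers is the 1.5 × IQR rule, the same rule that determines box-plot whiskers. Whether the project used this rule is an assumption. The helper name `iqr_outliers` and the sample ages below are hypothetical.

```python
from statistics import quantiles

def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]

# Made-up ages for illustration only.
ages = [25, 30, 32, 35, 38, 40, 41, 45, 50, 95]
print(iqr_outliers(ages))  # [95]
```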

eda5

It can be seen that no feature is heavily correlated with any other feature.
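
A correlation check of this kind can be sketched with pandas `DataFrame.corr()`, which computes pairwise Pearson correlations. The helper `correlated_pairs`, the 0.8 cutoff and the data are all hypothetical and only show the mechanics.

```python
from itertools import combinations

import pandas as pd

def correlated_pairs(df, threshold=0.8):
    """List column pairs whose absolute Pearson correlation exceeds threshold."""
    corr = df.corr()
    return [
        (a, b, round(corr.loc[a, b], 3))
        for a, b in combinations(corr.columns, 2)
        if abs(corr.loc[a, b]) > threshold
    ]

# Made-up numbers for illustration only: b is a linear function of a, c is unrelated.
df = pd.DataFrame(
    {
        "a": [1, 2, 3, 4, 5, 6],
        "b": [3, 5, 7, 9, 11, 13],
        "c": [2, 9, 4, 7, 1, 6],
    }
)
print(correlated_pairs(df))
```

An empty list from such a check would correspond to the conclusion above that no two features are heavily correlated.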

Performance by different models:

Random forest classifier

Random forest classifier gives 0.85 accuracy score and 0.85 precision score.

Confusion matrix:

eda7
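
Both scores can be derived from the four cells of a confusion matrix. The sketch below shows the arithmetic. The counts are made up for illustration and are not the project's results.

```python
def accuracy(tn, fp, fn, tp):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tn + fp + fn + tp)

def precision(fp, tp):
    """Fraction of predicted 'yes' cases that really are 'yes'."""
    return tp / (tp + fp)

# Made-up counts for illustration only (not the project's confusion matrix).
tn, fp, fn, tp = 40, 10, 10, 40
print(accuracy(tn, fp, fn, tp))
print(precision(fp, tp))
```

Accuracy counts every correct prediction, while precision only looks at the clients the model predicted would subscribe, which matters when each predicted "yes" triggers a phone call.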

SVM classifier

SVM classifier gives 0.72 accuracy score and 0.74 precision score.

Confusion matrix:

eda8

XGBoost classifier

XGBoost classifier gives 0.85 accuracy score and 0.85 precision score.

Confusion matrix:

eda6
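
A comparison like the one above can be sketched with scikit-learn. The example below trains a random forest and an SVM on synthetic data from `make_classification` and reports the same three measures. The data, the train/test split and the default hyperparameters are all assumptions, so the numbers it prints will not match the scores reported here. XGBoost is left out because it is a separate package, although its `XGBClassifier` follows the same `fit`/`predict` interface.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, confusion_matrix, precision_score
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Synthetic stand-in data; the real project would use the encoded bank dataset.
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Default hyperparameters are an assumption; the project's settings are not stated.
models = {
    "Random forest": RandomForestClassifier(random_state=42),
    "SVM": SVC(),
}

results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    y_pred = model.predict(X_test)
    results[name] = {
        "accuracy": accuracy_score(y_test, y_pred),
        "precision": precision_score(y_test, y_pred),
        "confusion_matrix": confusion_matrix(y_test, y_pred),
    }
    print(name, results[name]["accuracy"], results[name]["precision"])
```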

A web app for this model was also built with Flask and deployed on Heroku.
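
The project's application code is not shown here, so the following is only a minimal sketch of how a Flask app might serve predictions. The `/predict` route, the JSON field names and the `predict_subscription` function are hypothetical, and the function is a placeholder rule standing in for a trained model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_subscription(features):
    """Placeholder for a trained model (hypothetical rule, not the project's model)."""
    return "yes" if features.get("duration", 0) > 300 else "no"

@app.route("/predict", methods=["POST"])
def predict():
    # Read the client's features from the JSON request body.
    features = request.get_json(force=True)
    return jsonify({"deposit": predict_subscription(features)})
```

In a real deployment the placeholder would be replaced by a call to the saved model's `predict` method, and the app would be started with Flask's development server or a production WSGI server.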

You can view the web application here.

Screenshots of project: Screenshot (129) Screenshot (130) Screenshot (131) Screenshot (132) Screenshot (133)
