employee churn prediction

GSU - Machine Learning Lesson Project

Dataset

The HR Analytics dataset is published here for the employee churn prediction problem. A copy is also included in the project under the dataset folder, so you can access it easily.

Environment setup

* Conda (4.4.4)
* Python (3.5.3)
* H2O.ai (3.16.0.2)
* Conda & H2O.ai connection

Conda

Conda is an open source package management system and environment management system that runs on Windows, macOS and Linux.

You can find a detailed installation guide here.

Python

Conda supports Python by default, even if Python is already installed on your computer.

You can find a detailed installation guide here.

H2O.ai

H2O is an open source, in-memory, distributed, fast, and scalable machine learning and predictive analytics platform.

You can find a detailed installation guide and download the latest H2O release here.

After starting the H2O cluster on your machine, point your browser to http://localhost:54321, where H2O runs.
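Because everything that follows depends on a cluster being reachable at that address, a quick connectivity check can save a confusing failure later. The following is a minimal sketch using only the Python standard library; the function name `h2o_port_is_open` is made up for this example and is not part of H2O. An open port only shows that some service is listening there, not that the service is H2O.

```python
import socket


def h2o_port_is_open(host="localhost", port=54321, timeout=2.0):
    """Return True if something accepts TCP connections on host:port."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


if h2o_port_is_open():
    print("A service is listening on localhost:54321")
else:
    print("Nothing is listening on localhost:54321; start the H2O cluster first")
```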

Conda & H2O.ai connection

Before running employee-churn-prediction.py, the script that creates randomForestModel, you need to install H2O.ai with conda.

From a terminal, run:

conda install -c h2oai h2o=3.16.0.2
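To confirm that the active conda environment picked up the pinned release, you can compare the installed package version against 3.16.0.2. This is only a sketch: the helper names `installed_h2o_version` and `describe_install` are invented for illustration, and the check assumes the `h2o` Python package exposes its version as `h2o.__version__`.

```python
EXPECTED_H2O_VERSION = "3.16.0.2"  # version pinned in the conda command above


def installed_h2o_version():
    """Return the installed h2o version string, or None if h2o is not importable."""
    try:
        import h2o
    except ImportError:
        return None
    return h2o.__version__


def describe_install(found, expected=EXPECTED_H2O_VERSION):
    """Build a one-line status message for the version that was found."""
    if found is None:
        return "h2o is not installed in this environment"
    if found == expected:
        return "h2o {} is installed as expected".format(found)
    return "h2o {} is installed, but this project was set up with {}".format(
        found, expected
    )


print(describe_install(installed_h2o_version()))
```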

Code execution

To create a RandomForestModel with default parameters, you only need to run employee-churn-prediction.py. However, the H2O cluster must be running on localhost at the default port 54321 before execution. After execution, the model performance is written to the logs, for example:

MSE: 0.015226097527886655
RMSE: 0.12339407411981604
LogLoss: 0.0657476990661764
Mean Per-Class Error: 0.024314773392581035
AUC: 0.9932409589719724
Gini: 0.9864819179439448
Confusion Matrix (Act/Pred) for max f1 @ threshold = 0.34526066333055494: 
       0     1    Error    Rate
-----  ----  ---  -------  -------------
0      2284  13   0.0057   (13.0/2297.0)
1      36    689  0.0497   (36.0/725.0)
Total  2320  702  0.0162   (49.0/3022.0)

metric                       threshold    value     idx
---------------------------  -----------  --------  -----
max f1                       0.345261     0.965662  127
max f2                       0.216413     0.964335  160
max f0point5                 0.678677     0.978929  101
max accuracy                 0.350082     0.983786  125
max precision                1            1         0
max recall                   0.000342523  1         398
max specificity              1            1         0
max absolute_mcc             0.345261     0.955266  127
max min_per_class_accuracy   0.167209     0.971267  178
max mean_per_class_accuracy  0.216413     0.975685  160
Gains/Lift Table: Avg response rate: 23.99 %
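Several figures in this log can be cross-checked against each other by hand. The sketch below copies the counts from the confusion matrix and two scalar metrics from the log, then recomputes the derived values with the standard definitions (F1 from precision and recall, RMSE as the square root of MSE, Gini as 2 × AUC − 1). The variable names are chosen for this example only.

```python
import math

# Counts copied from the confusion matrix in the log above
# (rows are actual classes, columns are predicted classes).
TN, FP = 2284, 13
FN, TP = 36, 689

total = TN + FP + FN + TP

precision = TP / (TP + FP)
recall = TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)

# Share of misclassified rows (the "Total" row of the matrix).
error_rate = (FP + FN) / total
# Share of rows whose actual class is 1.
response_rate = (FN + TP) / total

# Scalar metrics copied from the same log.
MSE = 0.015226097527886655
AUC = 0.9932409589719724

rmse = math.sqrt(MSE)
gini = 2 * AUC - 1

print("F1            {:.6f}".format(f1))
print("Error rate    {:.4f}".format(error_rate))
print("Response rate {:.2f} %".format(response_rate * 100))
print("RMSE          {:.6f}".format(rmse))
print("Gini          {:.6f}".format(gini))
```

The recomputed F1 (0.965662), total error rate (0.0162), and response rate (23.99 %) agree with the values printed in the log. Note that the Mean Per-Class Error in the log is not derived from this particular confusion matrix, since the matrix is reported at the max-F1 threshold.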

In addition to the detailed performance results, if you need detailed information about the dataset and model, point your browser to http://localhost:54321 to open the H2O Flow GUI. employee-churn-prediction.py removes all H2O datasets before creating models, to prevent data duplication.
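The contents of employee-churn-prediction.py are not reproduced here, so the following is only a rough sketch of how such a workflow could look with the H2O Python API: connect to the running cluster, clear existing objects, load the data, and train a random forest with default parameters. The function names, the 80/20 split, the seed, and the example file and column names are all assumptions made for illustration, not the project's actual code.

```python
def feature_columns(all_columns, target):
    """Return every column name except the target, preserving order."""
    return [name for name in all_columns if name != target]


def train_churn_model(csv_path, target, url="http://localhost:54321", seed=42):
    """Train a default random forest on csv_path and return validation metrics.

    csv_path and target are placeholders; substitute the project's real values.
    """
    # Imported here so this file can be loaded even where h2o is not installed.
    import h2o
    from h2o.estimators.random_forest import H2ORandomForestEstimator

    # Attach to the cluster that is already running instead of starting one.
    h2o.connect(url=url)

    # Clear frames and models left over from earlier runs.
    h2o.remove_all()

    frame = h2o.import_file(csv_path)
    # Treat the target as categorical so H2O builds a classifier.
    frame[target] = frame[target].asfactor()

    # Assumed 80/20 split; the split used by the real script is not documented.
    train, valid = frame.split_frame(ratios=[0.8], seed=seed)

    model = H2ORandomForestEstimator(seed=seed)  # otherwise default parameters
    model.train(
        x=feature_columns(frame.columns, target),
        y=target,
        training_frame=train,
        validation_frame=valid,
    )
    return model.model_performance(valid)


# Example call (the file name and column name are hypothetical):
# print(train_churn_model("dataset/hr_data.csv", target="left"))
```

Calling `h2o.remove_all()` before importing means repeated runs do not pile up duplicate frames in the cluster, which matches the behavior described above.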

Contributors

kadriyedogan
