This project forked from ceshine/tbrain-insurance-policy-renewals

Predicting Insurance Policy Renewals

Top 1 solution to the TBrain - 客戶續約金額預測 machine learning competition.

This is the exact same code used to generate the best submission.

Solution Documentation (in traditional Chinese)

EDA (Exploratory Data Analysis) Notebook

Located inside the notebooks folder.

Model Training and Submission Generation

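Potential Compatibility Issue: The commands below use fish shell syntax. If you're using Bash, replace set -x SEED num; with SEED=num placed directly before the command (for example, SEED=9989 python simple_lgb.py), so that the variable is exported to the Python process.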
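Both forms export SEED as an environment variable to the Python process. A minimal sketch of how a script could pick it up, assuming the training scripts read SEED from the environment. The helper name read_seed and its default value are hypothetical and not taken from the repository:

```python
import os
import random


def read_seed(default=0):
    """Return the integer seed from the SEED environment variable (hypothetical helper)."""
    return int(os.environ.get("SEED", default))


seed = read_seed()
random.seed(seed)  # seeds Python's RNG only; a real script would likely also seed NumPy and LightGBM
```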

1st Layer

Train Full LGBM Regression Model

set -x SEED 9989; python simple_lgb.py

Train LGBM Regression Models with Limited Features

(The last parameter represents the number of best features to be used in training)

set -x SEED 11511; python truncate_lgb.py 50
set -x SEED 13511; python truncate_lgb.py 40
set -x SEED 12511; python truncate_lgb.py 60
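The idea of training on only the best N features can be sketched as below. This is an illustration only: how the repository ranks features is not stated in this README, so the importance scores and the helper top_k_features are assumptions.

```python
def top_k_features(importances, k):
    """Return the names of the k features with the highest importance scores."""
    ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
    return [name for name, _ in ranked[:k]]


# Made-up importance scores, for illustration only.
example_importances = {"feat_a": 120.0, "feat_b": 15.5, "feat_c": 310.2, "feat_d": 42.0}
selected = top_k_features(example_importances, k=2)
```

A model would then be trained on the columns named in selected instead of the full feature set.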

Train Full LGBM Classification Model

set -x SEED 1989; python lgb_binary_target.py

Train LGBM Classification Models with Limited Features

(The last parameter represents the number of best features to be used in training)

set -x SEED 22511; python lgb_binary_truncate.py 50
set -x SEED 23513; python lgb_binary_truncate.py 40
set -x SEED 52511; python lgb_binary_truncate.py 60

2nd Layer

Train LGB Ensembles

set -x SEED 51; python ensemble_lgb.py 50
set -x SEED 15; python ensemble_lgb.py 50

Train DNN Ensembles

set -x SEED 515; python ensemble_dnn.py
set -x SEED 151; python ensemble_dnn.py

3rd Layer - DNN Ensemble

set -x SEED 1515; python ensemble_dnn_2nd.py
set -x SEED 5151; python ensemble_dnn_2nd.py

Create Final Submission By Averaging

python avg_files.py

The final submission file (predictions for the test dataset) is written to the project root folder as sub_ens.csv.
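Averaging several prediction files row by row could look roughly like the sketch below. It is not the contents of avg_files.py: the function name, the id and pred column names, and the assumption that every file lists the same ids in the same order are all hypothetical.

```python
import csv


def average_predictions(paths, out_path, id_col="id", pred_col="pred"):
    """Average the prediction column of several CSV files row by row.

    Assumes every file lists the same ids in the same order (hypothetical layout).
    """
    tables = []
    for path in paths:
        with open(path, newline="") as f:
            tables.append(list(csv.DictReader(f)))

    if len({len(rows) for rows in tables}) != 1:
        raise ValueError("input files have different numbers of rows")

    with open(out_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow([id_col, pred_col])
        for rows in zip(*tables):
            ids = {row[id_col] for row in rows}
            if len(ids) != 1:
                raise ValueError(f"row ids do not match across files: {ids}")
            mean = sum(float(row[pred_col]) for row in rows) / len(rows)
            writer.writerow([rows[0][id_col], mean])
```

The row-count and id checks are there so that misaligned files fail loudly instead of being averaged silently.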

Note: The final model has a CV loss of 1657.81, slightly higher than that of the best submission (1656.24). This is because the number of models in the 1st and 2nd layers was reduced to save training time.
