
This project forked from maks-sh/scikit-uplift


Home Page: https://www.uplift-modeling.com

License: MIT License


scikit-uplift: uplift modeling in scikit-learn style in Python


scikit-uplift (sklift) is a Python package for uplift modeling that provides fast, sklearn-style model implementations, evaluation metrics, and visualization tools.

Uplift modeling estimates the causal effect of a treatment and uses it to target the customers who are most likely to respond to a marketing campaign.

Use cases for uplift modeling:

  • Target customers in a marketing campaign. This is especially useful when promoting a popular product, where a large share of customers take the target action on their own, without any influence. By modeling uplift you can find the customers who are likely to take the target action (for instance, install an app) only when treated (for instance, after receiving a push notification).
  • Combine a churn model and an uplift model to offer a bonus to customers who are likely to churn.
  • Select a small group of customers for a campaign where the cost per customer is high.

Read more about the uplift modeling problem in the User Guide.

Articles in Russian on habr.com: Part 1, Part 2, and Part 3.

Features:

  • Comfortable and intuitive scikit-learn-like API;
  • Works with any estimator compatible with scikit-learn (e.g. XGBoost, LightGBM, CatBoost, etc.);
  • All approaches can be used in sklearn.pipeline (see the example notebooks in English and Russian);
  • Almost all implemented approaches solve both classification and regression problems;
  • More uplift metrics than you have ever seen in one place, including gems such as Area Under Uplift Curve (AUUC) and Area Under Qini Curve (Qini coefficient), with ideal cases;
  • Useful visualizations for analyzing model performance.

Installation

Install the package from PyPI with the following command:

pip install scikit-uplift

Or install from source:

git clone https://github.com/maks-sh/scikit-uplift.git
cd scikit-uplift
python setup.py install

Documentation

The full documentation is available at uplift-modeling.com.

Or you can build the documentation locally using Sphinx 1.4 or later:

cd docs
pip install -r requirements.txt
make html

Then open _build/html/index.html in your browser to view the documentation site.

Quick Start

See the RetailHero tutorial notebook (available in English and Russian) for details.

Train an uplift model and predict uplift

Use the intuitive Python API to train uplift models with sklift.models.

# import approaches
from sklift.models import SoloModel, ClassTransformation, TwoModels
# import any estimator that adheres to scikit-learn conventions
from catboost import CatBoostClassifier


# define models
treatment_model = CatBoostClassifier(iterations=50, thread_count=3,
                                     random_state=42, silent=True)
control_model = CatBoostClassifier(iterations=50, thread_count=3,
                                   random_state=42, silent=True)

# define approach
tm = TwoModels(treatment_model, control_model, method='vanilla')
# fit model (X_train, y_train and treat_train must be prepared beforehand)
tm = tm.fit(X_train, y_train, treat_train)

# predict uplift
uplift_preds = tm.predict(X_val)
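
To make the idea behind a two-model approach concrete, here is a minimal pure-Python sketch. It is not sklift's implementation: it only shows the general pattern of fitting one estimator on treated customers, another on control customers, and taking the difference of their predictions as the uplift. The SegmentRateModel class, the two_model_uplift helper, and the toy data are hypothetical and exist only for this illustration.

```python
from collections import defaultdict


class SegmentRateModel:
    """Toy estimator (hypothetical): predicts each segment's observed response rate."""

    def fit(self, segments, y):
        totals = defaultdict(int)
        positives = defaultdict(int)
        for seg, target in zip(segments, y):
            totals[seg] += 1
            positives[seg] += target
        self.rates_ = {seg: positives[seg] / totals[seg] for seg in totals}
        # Fall back to the overall rate for segments unseen during fit.
        self.default_ = sum(y) / len(y)
        return self

    def predict(self, segments):
        return [self.rates_.get(seg, self.default_) for seg in segments]


def two_model_uplift(segments, y, treatment, new_segments):
    """Fit separate models on treated and control rows, then subtract predictions."""
    treated = [(s, t) for s, t, w in zip(segments, y, treatment) if w == 1]
    control = [(s, t) for s, t, w in zip(segments, y, treatment) if w == 0]
    model_t = SegmentRateModel().fit(*zip(*treated))
    model_c = SegmentRateModel().fit(*zip(*control))
    return [pt - pc for pt, pc in zip(model_t.predict(new_segments),
                                      model_c.predict(new_segments))]


# Hypothetical toy data: one categorical feature, binary target, binary treatment.
segments  = ["new", "new", "new", "new", "loyal", "loyal", "loyal", "loyal"]
treatment = [1,     1,     0,     0,     1,       1,       0,       0]
y         = [1,     1,     0,     0,     1,       0,       1,       0]

uplift = two_model_uplift(segments, y, treatment, ["new", "loyal"])
print(uplift)
```

In this toy data the "new" segment responds only when treated, so its estimated uplift is high, while the "loyal" segment responds at the same rate with or without treatment, so its estimated uplift is zero.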

Evaluate your uplift model

Uplift model evaluation metrics are available in sklift.metrics.

# import metrics to evaluate your model
from sklift.metrics import (
    uplift_at_k, uplift_auc_score, qini_auc_score, weighted_average_uplift
)


# Uplift@30%
tm_uplift_at_k = uplift_at_k(y_true=y_val, uplift=uplift_preds, treatment=treat_val,
                             strategy='overall', k=0.3)

# Area Under Qini Curve
tm_qini_auc = qini_auc_score(y_true=y_val, uplift=uplift_preds, treatment=treat_val)

# Area Under Uplift Curve
tm_uplift_auc = uplift_auc_score(y_true=y_val, uplift=uplift_preds, treatment=treat_val)

# Weighted average uplift
tm_wau = weighted_average_uplift(y_true=y_val, uplift=uplift_preds, treatment=treat_val)
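
As a rough illustration of what an uplift@k metric measures, the sketch below ranks customers by predicted uplift, keeps the top share k, and takes the difference between the treated and control response rates within that slice. It is a simplified pure-Python reading of the 'overall' strategy, written under that assumption rather than copied from sklift. The helper name uplift_at_k_sketch and the toy data are hypothetical.

```python
def uplift_at_k_sketch(y_true, uplift, treatment, k=0.3):
    """Response-rate difference (treated minus control) in the top-k share."""
    # Rank all customers together by predicted uplift, highest first.
    order = sorted(range(len(uplift)), key=lambda i: uplift[i], reverse=True)
    top = order[:int(len(order) * k)]

    treated = [y_true[i] for i in top if treatment[i] == 1]
    control = [y_true[i] for i in top if treatment[i] == 0]
    if not treated or not control:
        raise ValueError("top-k slice needs both treated and control rows")

    return sum(treated) / len(treated) - sum(control) / len(control)


# Hypothetical toy data for ten customers.
y_val       = [1, 0, 1, 0, 0, 1, 0, 0, 1, 0]
treat_val   = [1, 0, 1, 0, 1, 0, 1, 0, 1, 0]
uplift_pred = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]

score = uplift_at_k_sketch(y_val, uplift_pred, treat_val, k=0.4)
print(score)
```

A value close to the difference in response rates over the whole sample suggests the ranking adds little; a clearly higher value in the top slice suggests the model is concentrating persuadable customers at the top.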

Visualize the results

Visualize performance metrics with sklift.viz.

# import visualization tools
from sklift.viz import plot_qini_curve

plot_qini_curve(y_true=y_val, uplift=uplift_preds, treatment=treat_val, negative_effect=True)

Figure: example of a model's Qini curve alongside the perfect and random Qini curves.
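
For readers who want to see where the points of a Qini curve come from, the following pure-Python sketch computes them under one commonly used definition: after sorting by predicted uplift, the value at position n is the number of treated responders so far minus the number of control responders so far, scaled by the ratio of treated to control counts. This is an assumption about the definition, not a copy of sklift's plotting code; qini_curve_sketch and the toy data are hypothetical.

```python
def qini_curve_sketch(y_true, uplift, treatment):
    """Return (n, qini) points for each prefix of the uplift-sorted list."""
    order = sorted(range(len(uplift)), key=lambda i: uplift[i], reverse=True)

    points = [(0, 0.0)]
    n_t = n_c = y_t = y_c = 0
    for n, i in enumerate(order, start=1):
        if treatment[i] == 1:
            n_t += 1
            y_t += y_true[i]
        else:
            n_c += 1
            y_c += y_true[i]
        # Scale control responses to the size of the treated group so far;
        # with no control rows yet, the control term is taken as zero.
        scaled_control = y_c * n_t / n_c if n_c else 0.0
        points.append((n, y_t - scaled_control))
    return points


# Hypothetical toy data for six customers.
y_val       = [1, 0, 1, 0, 0, 1]
treat_val   = [1, 0, 1, 0, 1, 0]
uplift_pred = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]

curve = qini_curve_sketch(y_val, uplift_pred, treat_val)
print(curve)
```

The curve rises while the ranking keeps surfacing treated responders and falls once control responders start to appear, which is the shape a plot of these points would show.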

Development

We welcome new contributors of all experience levels.

If you have any questions, please contact us at [email protected]


Papers and materials

  1. Gutierrez, P., & Gérardy, J. Y.

    Causal Inference and Uplift Modelling: A Review of the Literature. In International Conference on Predictive Applications and APIs (pp. 1-13).

  2. Artem Betlei, Criteo Research; Eustache Diemert, Criteo Research; Massih-Reza Amini, Univ. Grenoble Alpes

    Dependent and Shared Data Representations improve Uplift Prediction in Imbalanced Treatment Conditions FAIM'18 Workshop on CausalML.

  3. Eustache Diemert, Artem Betlei, Christophe Renaudin, and Massih-Reza Amini. 2018.

    A Large Scale Benchmark for Uplift Modeling. In Proceedings of AdKDD & TargetAd (ADKDD’18). ACM, New York, NY, USA, 6 pages.

  4. Athey, Susan, and Imbens, Guido. 2015.

    Machine learning methods for estimating heterogeneous causal effects. Preprint, arXiv:1504.01132.

  5. Oscar Mesalles Naranjo. 2012.

    Testing a New Metric for Uplift Models. Dissertation Presented for the Degree of MSc in Statistics and Operational Research.

  6. Kane, K., V. S. Y. Lo, and J. Zheng. 2014.

    Mining for the Truly Responsive Customers and Prospects Using True-Lift Modeling: Comparison of New and Existing Methods. Journal of Marketing Analytics 2 (4): 218–238.

  7. Maciej Jaskowski and Szymon Jaroszewicz.

    Uplift modeling for clinical trial data. ICML Workshop on Clinical Data Analysis, 2012.

  8. Lo, Victor. 2002.

    The True Lift Model - A Novel Data Mining Approach to Response Modeling in Database Marketing. SIGKDD Explorations. 4. 78-86.

  9. Zhao, Yan & Fang, Xiao & Simchi-Levi, David. 2017.

    Uplift Modeling with Multiple Treatments and General Response Types. 10.1137/1.9781611974973.66.

  10. Nicholas J Radcliffe. 2007. Using control groups to target on predicted lift: Building and assessing uplift model. Direct Marketing Analytics Journal, (3):14–21, 2007.
  11. Devriendt, F., Guns, T., & Verbeke, W. 2020. Learning to rank for uplift modeling. ArXiv, abs/2002.05897.

Tags

EN: uplift modeling, uplift modelling, causal inference, causal effect, causality, individual treatment effect, true lift, net lift, incremental modeling

RU: аплифт моделирование, Uplift модель

ZH: 隆起建模,因果推断,因果效应,因果关系,个人治疗效应,真正的电梯,净电梯

