scikit-uplift (sklift) is a Python package for uplift modeling that provides fast, sklearn-style model implementations, evaluation metrics, and visualization tools.
The main goal is to provide an easy-to-use and fast Python package for uplift modeling. Its models follow the familiar scikit-learn API, so any popular estimator can be plugged in (for instance, one from the CatBoost library).
Uplift modeling estimates the causal effect of a treatment and uses it to effectively target the customers who are most likely to respond to a marketing campaign.
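As a rough illustration of the quantity being estimated, the sketch below computes the average uplift of a campaign as the difference between the response rates of the treated and control groups. The data and the helper names (`response_rate`, `average_uplift`) are made up for this example and are not part of sklift; an uplift model aims to estimate the same difference per customer rather than for the whole population.

```python
# Toy campaign log: one (treated, responded) pair per customer.
# The values are invented purely for illustration.
records = [
    (1, 1), (1, 1), (1, 0), (1, 1), (1, 0),  # treated group
    (0, 1), (0, 0), (0, 0), (0, 1), (0, 0),  # control group
]


def response_rate(rows, group):
    """Share of customers in the given group (1 = treated, 0 = control) who responded."""
    outcomes = [responded for treated, responded in rows if treated == group]
    return sum(outcomes) / len(outcomes)


def average_uplift(rows):
    """Treated response rate minus control response rate."""
    return response_rate(rows, 1) - response_rate(rows, 0)


uplift = average_uplift(records)
print(f"average uplift: {uplift:.2f}")
```

This estimate is only meaningful when treatment was assigned at random, so that the two groups are comparable.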
Use cases for uplift modeling:
Target customers in a marketing campaign. This is especially useful when promoting a popular product, where a large share of customers perform the target action on their own, without any influence. By modeling uplift you can find the customers who are likely to perform the target action (for instance, install an app) only when treated (for instance, after receiving a push notification).
Combine a churn model and an uplift model to offer some bonus to a group of customers who are likely to churn.
Select a small group of customers for a campaign in which the cost per customer is high.
Read more about the uplift modeling problem in the User Guide.
Comfortable and intuitive scikit-learn-like API;
More uplift metrics than you have ever seen in one place, including gems like Area Under Uplift Curve (AUUC) and Area Under Qini Curve (Qini coefficient), with ideal cases;
Support for any estimator compatible with scikit-learn (e.g. XGBoost, LightGBM, CatBoost, etc.);
All approaches can be used in sklearn.pipeline. See the usage example on the Tutorials page;
The metrics are also compatible with the classes from sklearn.model_selection. See the usage example on the Tutorials page;
Almost all implemented approaches solve classification and regression problems;
Nice and useful visualizations for analysing model performance.
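To make the pipeline compatibility concrete, here is a minimal sketch that does not use sklift itself. `TwoModelsSketch` is a hypothetical stand-in for an uplift estimator whose `fit` takes an extra `treatment` argument; the step names and the synthetic data are likewise assumptions. It shows how scikit-learn's `Pipeline` forwards such an argument to the final step through the `stepname__parameter` convention.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class TwoModelsSketch(BaseEstimator):
    """Hypothetical uplift estimator: one model per group, uplift = difference."""

    def __init__(self, estimator):
        self.estimator = estimator

    def fit(self, X, y, treatment):
        y, treatment = np.asarray(y), np.asarray(treatment)
        treated, control = treatment == 1, treatment == 0
        # Fit independent copies of the base estimator on each group.
        self.model_treated_ = clone(self.estimator).fit(X[treated], y[treated])
        self.model_control_ = clone(self.estimator).fit(X[control], y[control])
        return self

    def predict(self, X):
        # Predicted uplift: P(y=1 | treated) - P(y=1 | control).
        return (self.model_treated_.predict_proba(X)[:, 1]
                - self.model_control_.predict_proba(X)[:, 1])


# Synthetic data, invented for this example.
rng = np.random.default_rng(42)
X = rng.normal(size=(500, 4))
treatment = rng.integers(0, 2, size=500)
y = (rng.random(500) < 0.3 + 0.2 * treatment).astype(int)

pipe = Pipeline([
    ("scale", StandardScaler()),
    ("model", TwoModelsSketch(LogisticRegression())),
])
# "model__treatment" routes the treatment vector to the fit() of the "model" step.
pipe.fit(X, y, model__treatment=treatment)
uplift_scores = pipe.predict(X)
print(uplift_scores[:5])
```

The preprocessing steps only ever see `X` and `y`; the treatment vector reaches the final estimator alone, which is why the prefix must match that step's name.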
The package currently supports the following methods:
Solo Model (aka S-learner or Treatment Dummy, Treatment interaction) approach
Class Transformation (aka Class Variable Transformation or Revert Label) approach
Two Models (aka T-learner, naïve approach, difference score method, or double classifier approach), including Dependent Data Representation
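The three approaches listed above can be sketched with plain scikit-learn as follows. This is an illustration of the general ideas under stated assumptions, not the package's implementation: the function names, the choice of `GradientBoostingClassifier`, and the synthetic data are all made up for the example, and the class transformation formula assumes that treatment is assigned at random with probability 0.5.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier


def make_estimator():
    # Any scikit-learn classifier with predict_proba would do; this choice is arbitrary.
    return GradientBoostingClassifier(random_state=0)


def solo_model_uplift(X, y, t, X_new):
    """Solo Model: one model, with the treatment flag as an extra feature."""
    model = make_estimator().fit(np.column_stack([X, t]), y)
    as_treated = np.column_stack([X_new, np.ones(len(X_new))])
    as_control = np.column_stack([X_new, np.zeros(len(X_new))])
    return model.predict_proba(as_treated)[:, 1] - model.predict_proba(as_control)[:, 1]


def class_transformation_uplift(X, y, t, X_new):
    """Class Transformation: predict z = 1 when y == t, then uplift = 2 * P(z=1) - 1."""
    z = (y == t).astype(int)
    model = make_estimator().fit(X, z)
    return 2 * model.predict_proba(X_new)[:, 1] - 1


def two_models_uplift(X, y, t, X_new):
    """Two Models: separate models for treated and control, uplift = difference."""
    model_t = make_estimator().fit(X[t == 1], y[t == 1])
    model_c = make_estimator().fit(X[t == 0], y[t == 0])
    return model_t.predict_proba(X_new)[:, 1] - model_c.predict_proba(X_new)[:, 1]


# Synthetic data (an assumption for illustration): the treatment raises the
# response probability by 0.3, but only for customers with X[:, 0] > 0.
rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 3))
t = rng.integers(0, 2, size=n)
p = 0.2 + 0.1 * (X[:, 1] > 0) + 0.3 * t * (X[:, 0] > 0)
y = (rng.random(n) < p).astype(int)

X_new = rng.normal(size=(2000, 3))
responsive = X_new[:, 0] > 0

results = {}
for name, approach in [
    ("solo model", solo_model_uplift),
    ("class transformation", class_transformation_uplift),
    ("two models", two_models_uplift),
]:
    u = approach(X, y, t, X_new)
    results[name] = (u[responsive].mean(), u[~responsive].mean())
    print(f"{name}: responsive={results[name][0]:.3f}, others={results[name][1]:.3f}")
```

On data like this, each approach should assign a clearly higher predicted uplift to the responsive segment than to the rest, which is the ranking behaviour an uplift model is meant to provide.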
And the following metrics:
Area Under Uplift Curve
Area Under Qini Curve
Weighted average uplift
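For intuition about these metrics, the sketch below computes a Qini-style curve, the area under it, and a weighted average uplift with NumPy. The function names and toy data are invented for this example, and the definitions are simplified assumptions: the unnormalized area shown here is not the package's Qini coefficient, and the per-bin weighting by treated-group size is one common convention rather than a confirmed description of the library's behaviour.

```python
import numpy as np


def qini_curve_sketch(y, uplift, treatment):
    """Cumulative incremental responses when targeting by descending predicted uplift."""
    y, uplift, treatment = map(np.asarray, (y, uplift, treatment))
    order = np.argsort(-uplift, kind="stable")
    y, treatment = y[order], treatment[order]
    n_t = np.cumsum(treatment).astype(float)        # treated customers so far
    n_c = np.cumsum(1 - treatment).astype(float)    # control customers so far
    y_t = np.cumsum(y * treatment).astype(float)    # treated responders so far
    y_c = np.cumsum(y * (1 - treatment)).astype(float)
    # Scale control responders to the treated group size; ratio is 0 while n_c == 0.
    ratio = np.divide(n_t, n_c, out=np.zeros_like(n_t), where=n_c > 0)
    return np.concatenate([[0.0], y_t - y_c * ratio])


def area_under_curve(values):
    """Trapezoidal area with unit spacing between consecutive points."""
    values = np.asarray(values, dtype=float)
    return float(np.sum((values[1:] + values[:-1]) / 2.0))


def weighted_average_uplift_sketch(y, uplift, treatment, bins=2):
    """Per-bin uplift averaged with weights equal to the treated count in each bin."""
    y, uplift, treatment = map(np.asarray, (y, uplift, treatment))
    order = np.argsort(-uplift, kind="stable")
    total, weight = 0.0, 0
    for idx in np.array_split(order, bins):
        y_b, t_b = y[idx], treatment[idx]
        n_treated = int((t_b == 1).sum())
        # Assumes every bin contains both treated and control customers.
        bin_uplift = y_b[t_b == 1].mean() - y_b[t_b == 0].mean()
        total += n_treated * bin_uplift
        weight += n_treated
    return total / weight


# Toy data, invented for illustration.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
treated = [1, 0, 1, 0, 1, 0, 1, 0]
outcome = [1, 0, 1, 0, 1, 1, 0, 1]

curve = qini_curve_sketch(outcome, scores, treated)
area = area_under_curve(curve)
wau = weighted_average_uplift_sketch(outcome, scores, treated, bins=2)
print(curve, area, wau)
```

A model that ranks truly persuadable customers first produces a curve that rises quickly and then flattens or falls, so a larger area indicates a better ranking.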
GitHub repository: https://github.com/maks-sh/scikit-uplift
GitHub examples: https://github.com/maks-sh/scikit-uplift/tree/master/notebooks
Contributing guide: https://www.uplift-modeling.com/en/latest/contributing.html
Sklift is being actively maintained and welcomes new contributors of all experience levels.
Please see our Contributing Guide for more details.
By participating in this project, you agree to abide by its Code of Conduct.
If you have any questions, please contact us at email@example.com
Papers and materials
- Gutierrez, P., & Gérardy, J. Y.
Causal Inference and Uplift Modelling: A Review of the Literature. In International Conference on Predictive Applications and APIs (pp. 1-13).
- Artem Betlei, Criteo Research; Eustache Diemert, Criteo Research; Massih-Reza Amini, Univ. Grenoble Alpes
Dependent and Shared Data Representations improve Uplift Prediction in Imbalanced Treatment Conditions FAIM’18 Workshop on CausalML.
- Eustache Diemert, Artem Betlei, Christophe Renaudin, and Massih-Reza Amini. 2018.
A Large Scale Benchmark for Uplift Modeling. In Proceedings of AdKDD & TargetAd (ADKDD’18). ACM, New York, NY, USA, 6 pages.
- Athey, Susan, and Imbens, Guido. 2015.
Machine learning methods for estimating heterogeneous causal effects. Preprint, arXiv:1504.01132.
- Oscar Mesalles Naranjo. 2012.
Testing a New Metric for Uplift Models. Dissertation Presented for the Degree of MSc in Statistics and Operational Research.
- Kane, K., V. S. Y. Lo, and J. Zheng. 2014.
Mining for the Truly Responsive Customers and Prospects Using True-Lift Modeling: Comparison of New and Existing Methods. Journal of Marketing Analytics 2 (4): 218–238.
- Maciej Jaskowski and Szymon Jaroszewicz.
Uplift modeling for clinical trial data. ICML Workshop on Clinical Data Analysis, 2012.
- Lo, Victor. 2002.
The True Lift Model - A Novel Data Mining Approach to Response Modeling in Database Marketing. SIGKDD Explorations. 4. 78-86.
- Zhao, Yan & Fang, Xiao & Simchi-Levi, David. 2017.
Uplift Modeling with Multiple Treatments and General Response Types. 10.1137/1.9781611974973.66.
- Nicholas J Radcliffe. 2007.
Using control groups to target on predicted lift: Building and assessing uplift models. Direct Marketing Analytics Journal, (3):14–21.
- Devriendt, F., Guns, T., & Verbeke, W. 2020.
Learning to rank for uplift modeling. ArXiv, abs/2002.05897.