- sklift.datasets.datasets.fetch_megafon(data_home=None, dest_subdir=None, download_if_missing=True, return_X_y_t=False)
Load and return the MegaFon Uplift Competition dataset (classification).
An uplift modeling dataset of synthetic data generated by telecom companies to approximate the real cases they have encountered.
X_1...X_50: anonymized feature set
treatment_group (str): treatment/control group flag
conversion (binary): customer purchasing
Read more in the docs.
data_home (str) – The path to the folder where datasets are stored.
dest_subdir (str) – The name of the folder in which the dataset is stored.
download_if_missing (bool) – Download the data if not present. Raises an IOError if False and data is missing.
return_X_y_t (bool) – If True, returns (data, target, treatment) instead of a Bunch object.
By default, returns a dictionary-like Bunch object with the following attributes:
data (DataFrame object): Dataset without target and treatment.
target (Series object): Values of the target column.
treatment (Series object): Values of the treatment column.
DESCR (str): Description of the MegaFon dataset.
feature_names (list): Names of the features.
target_name (str): Name of the target.
treatment_name (str): Name of the treatment.
tuple (data, target, treatment) if return_X_y_t is True
- Return type
Bunch or tuple
from sklift.datasets import fetch_megafon

dataset = fetch_megafon()
data, target, treatment = dataset.data, dataset.target, dataset.treatment

# alternative option
data, target, treatment = fetch_megafon(return_X_y_t=True)
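The treatment_group column is stored as a string flag, while uplift estimators typically expect a binary 0/1 treatment indicator. The following is a minimal sketch of one way to encode it. It assumes the two labels are "treatment" and "control", which is an assumption worth verifying against the loaded data (for example with treatment.unique()). The helper encode_treatment is hypothetical and not part of sklift, and a small toy list stands in for the Series returned by fetch_megafon().

```python
def encode_treatment(labels, treated_label="treatment", control_label="control"):
    """Map string group labels to 1 (treated) or 0 (control).

    The label names are assumptions; check them against the real data.
    """
    mapping = {treated_label: 1, control_label: 0}
    encoded = []
    for label in labels:
        if label not in mapping:
            # Fail loudly rather than silently mis-encoding an unknown group.
            raise ValueError(f"unexpected treatment label: {label!r}")
        encoded.append(mapping[label])
    return encoded


# Toy stand-in for the `treatment` Series returned by fetch_megafon().
toy_treatment = ["control", "treatment", "treatment", "control"]
encoded_toy = encode_treatment(toy_treatment)
```

With a pandas Series, the same mapping can be written as treatment.map({"control": 0, "treatment": 1}), again assuming those are the labels present.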
fetch_lenta(): Load and return the Lenta dataset (classification).
fetch_x5(): Load and return the X5 RetailHero dataset (classification).
fetch_criteo(): Load and return the Criteo Uplift Prediction Dataset (classification).
fetch_hillstrom(): Load and return Kevin Hillstrom Dataset MineThatData (classification or regression).
MegaFon Uplift Competition Dataset
The dataset was provided by MegaFon for the MegaFon Uplift Competition, held in May 2021.
It contains synthetic data generated to approximate the real cases the company has encountered.
X_1…X_50: anonymized feature set
treatment_group (str): treatment/control group flag
conversion (binary): customer purchasing
Response Ratio: 0.2
Treatment Ratio: 0.5
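The two ratios above can be checked against the loaded data. The sketch below assumes that Response Ratio means the share of rows with conversion equal to 1 and that Treatment Ratio means the share of rows in the treatment group. Both readings are assumptions, as is the "treatment" label. The helpers response_ratio and treatment_ratio are hypothetical, and toy lists stand in for the real target and treatment columns.

```python
def response_ratio(target):
    """Share of rows whose binary target equals 1."""
    values = list(target)
    return sum(values) / len(values)


def treatment_ratio(treatment, treated_label="treatment"):
    """Share of rows flagged as treated (label name is an assumption)."""
    labels = list(treatment)
    treated = sum(1 for label in labels if label == treated_label)
    return treated / len(labels)


# Toy stand-ins shaped like the dataset's target and treatment columns.
toy_target = [0, 0, 1, 0, 0, 0, 0, 1, 0, 0]
toy_treatment = ["treatment", "control"] * 5

toy_response = response_ratio(toy_target)      # 2 conversions out of 10 rows
toy_treated = treatment_ratio(toy_treatment)   # 5 treated rows out of 10
```

Both helpers accept any iterable, so the target and treatment Series returned by fetch_megafon(return_X_y_t=True) can be passed in directly.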