sklift.metrics.uplift_by_percentile

sklift.metrics.metrics.uplift_by_percentile(y_true, uplift, treatment, strategy='overall', bins=10, std=False, total=False)
Compute metrics at each percentile: uplift, group size, group response rate, and standard deviation.
The result is a pandas DataFrame with metrics in columns and percentiles in rows:
n_treatment, n_control – group sizes.
response_rate_treatment, response_rate_control – group response rates.
uplift – treatment response rate minus control response rate.
std_treatment, std_control – (optional) standard deviations of the response rates.
std_uplift – (optional) standard deviation of the uplift.
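To make the column definitions concrete, here is a minimal pure-Python sketch that builds the non-optional columns for each bin. It is not the library's implementation: the helper names (split_into_bins, mean_or_nan, percentile_table), the binning into consecutive chunks of nearly equal size, and the toy data are all assumptions made for illustration, and the result is a list of dicts rather than a pandas DataFrame.

```python
def split_into_bins(items, bins):
    """Split a list into `bins` consecutive chunks of nearly equal size."""
    size, extra = divmod(len(items), bins)
    chunks, start = [], 0
    for i in range(bins):
        # The first `extra` chunks get one additional element.
        end = start + size + (1 if i < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


def mean_or_nan(values):
    """Mean of a list, or NaN when the list is empty."""
    return sum(values) / len(values) if values else float("nan")


def percentile_table(y_true, uplift, treatment, bins):
    """Per-bin group sizes, response rates and uplift (illustrative only)."""
    # Order observations by predicted uplift, highest first.
    rows = sorted(zip(uplift, y_true, treatment), key=lambda r: r[0], reverse=True)
    table = []
    for chunk in split_into_bins(rows, bins):
        trmnt = [y for _, y, t in chunk if t == 1]  # treated outcomes in this bin
        ctrl = [y for _, y, t in chunk if t == 0]   # control outcomes in this bin
        rr_trmnt, rr_ctrl = mean_or_nan(trmnt), mean_or_nan(ctrl)
        table.append({
            "n_treatment": len(trmnt),
            "n_control": len(ctrl),
            "response_rate_treatment": rr_trmnt,
            "response_rate_control": rr_ctrl,
            "uplift": rr_trmnt - rr_ctrl,
        })
    return table


# Hypothetical toy data: 8 observations with distinct uplift scores.
uplift = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
treatment = [1, 0, 1, 0, 1, 0, 1, 0]

table = percentile_table(y_true, uplift, treatment, bins=2)
for row in table:
    print(row)
```

Each dict corresponds to one row (percentile) of the table described above. A bin in which one group is empty yields NaN for that group's response rate and for the uplift.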
Parameters:  y_true (1d arraylike) – Correct (true) target values.
 uplift (1d arraylike) – Predicted uplift, as returned by a model.
 treatment (1d arraylike) – Treatment labels.
 strategy (string, ['overall', 'by_group']) – Determines the calculation strategy. Default is 'overall'.
'overall': Take the first k observations of all test data ordered by predicted uplift (across both the control and treatment groups) and calculate the conversions in the treatment and control groups on these observations only. Then calculate the difference between these conversions.
'by_group': Calculate the conversions separately in the top k observations of each group (control and treatment), sorted by predicted uplift. Then calculate the difference between these conversions.
 std (bool) – If True, add columns with the uplift standard deviation and the response rate standard deviation. Default is False.
 total (bool) – If True, add a last row with the total values. Default is False. The total uplift is a weighted average uplift; see weighted_average_uplift(). The total response rate is the response rate on the full data.
 bins (int) – Determines the number of bins (and the relative percentile) in the data. Default is 10.
Returns: DataFrame where metrics are by columns and percentiles are by rows.
Return type: pandas.DataFrame
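The difference between the two strategy values can be illustrated with a small pure-Python sketch. It is an assumption-laden illustration, not the library's code: the function name uplift_at_share, the share argument (the top fraction of observations, standing in for "top k") and the toy data are hypothetical.

```python
def mean(values):
    """Arithmetic mean; raises ZeroDivisionError on an empty list."""
    return sum(values) / len(values)


def uplift_at_share(y_true, uplift, treatment, share, strategy="overall"):
    """Uplift in the top `share` of observations (illustrative only)."""
    # Order observations by predicted uplift, highest first.
    rows = sorted(zip(uplift, y_true, treatment), key=lambda r: r[0], reverse=True)
    if strategy == "overall":
        # Take the top share of ALL observations, then split by group.
        top = rows[:int(len(rows) * share)]
        trmnt = [y for _, y, t in top if t == 1]
        ctrl = [y for _, y, t in top if t == 0]
    elif strategy == "by_group":
        # Split by group first, then take the top share of EACH group.
        trmnt_all = [y for _, y, t in rows if t == 1]
        ctrl_all = [y for _, y, t in rows if t == 0]
        trmnt = trmnt_all[:int(len(trmnt_all) * share)]
        ctrl = ctrl_all[:int(len(ctrl_all) * share)]
    else:
        raise ValueError("strategy must be 'overall' or 'by_group'")
    # Difference between treatment and control conversions.
    return mean(trmnt) - mean(ctrl)


# Hypothetical toy data: 4 treated and 4 control observations.
uplift = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
treatment = [1, 1, 0, 1, 1, 0, 0, 0]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]

overall = uplift_at_share(y_true, uplift, treatment, share=0.5, strategy="overall")
by_group = uplift_at_share(y_true, uplift, treatment, share=0.5, strategy="by_group")
print(overall, by_group)
```

On this data the two strategies give different values because the top half of all observations contains three treated units and one control unit, whereas 'by_group' always compares the same share of each group.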