sklift.metrics.metrics.uplift_by_percentile(y_true, uplift, treatment, strategy='overall', bins=10, std=False, total=False)

Compute uplift, group size, group response rate, and (optionally) standard deviation at each percentile.

The result is a pandas DataFrame with metrics in columns and percentiles in rows:

  • n_treatment, n_control - group sizes.
  • response_rate_treatment, response_rate_control - group response rates.
  • uplift - treatment response rate minus control response rate.
  • std_treatment, std_control - (optional) standard deviation of the response rates.
  • std_uplift - (optional) standard deviation of the uplift.

Parameters:

  • y_true (1d array-like) – Correct (true) target values.
  • uplift (1d array-like) – Predicted uplift, as returned by a model.
  • treatment (1d array-like) – Treatment labels.
  • strategy (string, ['overall', 'by_group']) –

    Determines how conversions are calculated. Default is 'overall'.

    • 'overall':
      Sort all test observations (control and treatment together) by predicted uplift and take the first k. Compute the conversion rates of the treatment and control groups within those k observations, then take the difference between the two rates.
    • 'by_group':
      Sort each group (control and treatment) separately by predicted uplift and take the first k observations of each. Compute the conversion rate within each group's top k, then take the difference between the two rates.
  • std (bool) – If True, add columns with the uplift standard deviation and the response rate standard deviation. Default is False.
  • total (bool) – If True, add a last row with the total values. Default is False. The total uplift is a weighted average uplift; see weighted_average_uplift(). The total response rate is the response rate over the entire dataset.
  • bins (int) – Number of bins the data is split into, which also sets the percentile width of each bin. Default is 10.
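
The two strategies described above can be illustrated with a small, self-contained sketch. This is not sklift's own code. The function name `uplift_table_sketch` is hypothetical, and the sketch assumes binary treatment labels (1 = treatment, 0 = control), a descending sort by predicted uplift, and near-equal bins produced by `np.array_split`. It reproduces only the group-size, response-rate, and uplift columns.

```python
import numpy as np
import pandas as pd


def uplift_table_sketch(y_true, uplift, treatment, strategy="overall", bins=10):
    """Illustrative per-bin uplift table (hypothetical helper, not the library)."""
    y_true, uplift, treatment = map(np.asarray, (y_true, uplift, treatment))
    is_trmnt = treatment == 1  # assumption: 1 marks the treatment group

    def binned(mask):
        # Return per-bin sizes and response rates for the rows selected by mask.
        if strategy == "overall":
            # Rank all rows together, then keep the group's rows inside each bin.
            order = np.argsort(-uplift, kind="mergesort")
            chunks = [idx[mask[idx]] for idx in np.array_split(order, bins)]
        elif strategy == "by_group":
            # Rank the group's rows on their own, then split them into bins.
            members = np.flatnonzero(mask)
            order = members[np.argsort(-uplift[members], kind="mergesort")]
            chunks = np.array_split(order, bins)
        else:
            raise ValueError("strategy must be 'overall' or 'by_group'")
        n = np.array([len(c) for c in chunks])
        rate = np.array([y_true[c].mean() if len(c) else np.nan for c in chunks])
        return n, rate

    n_t, rate_t = binned(is_trmnt)
    n_c, rate_c = binned(~is_trmnt)
    return pd.DataFrame({
        "n_treatment": n_t,
        "n_control": n_c,
        "response_rate_treatment": rate_t,
        "response_rate_control": rate_c,
        "uplift": rate_t - rate_c,  # treatment rate minus control rate
    })


if __name__ == "__main__":
    # Tiny made-up dataset, for illustration only.
    y = [1, 0, 1, 0, 1, 1, 0, 0]
    scores = [0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1]
    groups = [1, 0, 1, 0, 1, 0, 1, 0]
    print(uplift_table_sketch(y, scores, groups, strategy="overall", bins=2))
```

With 'overall', a bin can contain very different numbers of treatment and control observations, because the split is made on the combined ranking. With 'by_group', each group is split on its own ranking, so its bins are of near-equal size.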

Returns:

DataFrame with metrics in columns and percentiles in rows.

Return type: