Uplift Modeling with High Class Imbalance

Otto Nyberg, Tomasz Kuśmierczyk, Arto Klami
Proceedings of The 13th Asian Conference on Machine Learning, PMLR 157:315-330, 2021.

Abstract

Uplift modeling refers to estimating the causal effect of a treatment on an individual observation, used for instance to identify customers worth targeting with a discount in e-commerce. We introduce a simple yet effective undersampling strategy for dealing with the prevalent problem of high class imbalance (low conversion rate) in such applications. Our strategy is agnostic to the base learners and produces a 6.5% improvement over the best published benchmark for the largest public uplift data, which incidentally exhibits high class imbalance. We also introduce a new calibration metric for uplift modeling and present a strategy to improve the calibration of the proposed method.
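As a rough illustration of the general idea only, the sketch below shows generic negative-class undersampling followed by the standard odds-based probability correction, with uplift taken as the difference between corrected treated and control conversion probabilities. It is not the paper's specific strategy; the function names and the `keep_frac` parameter are hypothetical, and the base learner that would produce the sampled-scale probabilities is left out.

```python
import random


def undersample_negatives(labels, keep_frac, seed=0):
    """Return indices to keep: every positive, each negative with probability keep_frac.

    Hypothetical helper for illustration; labels are 0/1 conversion indicators.
    """
    rng = random.Random(seed)
    return [i for i, y in enumerate(labels) if y == 1 or rng.random() < keep_frac]


def correct_probability(p_sampled, keep_frac):
    """Map a probability learned on undersampled data back to the original scale.

    Undersampling negatives at rate keep_frac divides the odds of the negative
    class by 1/keep_frac, so the original odds are keep_frac times the sampled odds.
    """
    return keep_frac * p_sampled / (keep_frac * p_sampled + (1.0 - p_sampled))


def uplift(p_treated_sampled, p_control_sampled, keep_frac):
    """Uplift estimate: corrected treated probability minus corrected control probability."""
    return (correct_probability(p_treated_sampled, keep_frac)
            - correct_probability(p_control_sampled, keep_frac))


# Example: 1 positive among 100 observations, keeping roughly 10% of negatives.
labels = [1] + [0] * 99
kept = undersample_negatives(labels, keep_frac=0.1)
```

Without the correction step, a model trained on the undersampled data overestimates conversion probabilities, and the difference between two such inflated estimates is no longer on the scale of the true uplift.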

Cite this Paper


BibTeX
@InProceedings{pmlr-v157-nyberg21a,
  title = {Uplift Modeling with High Class Imbalance},
  author = {Nyberg, Otto and Ku\'smierczyk, Tomasz and Klami, Arto},
  booktitle = {Proceedings of The 13th Asian Conference on Machine Learning},
  pages = {315--330},
  year = {2021},
  editor = {Balasubramanian, Vineeth N. and Tsang, Ivor},
  volume = {157},
  series = {Proceedings of Machine Learning Research},
  month = {17--19 Nov},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v157/nyberg21a/nyberg21a.pdf},
  url = {https://proceedings.mlr.press/v157/nyberg21a.html},
  abstract = {Uplift modeling refers to estimating the causal effect of a treatment on an individual observation, used for instance to identify customers worth targeting with a discount in e-commerce. We introduce a simple yet effective undersampling strategy for dealing with the prevalent problem of high class imbalance (low conversion rate) in such applications. Our strategy is agnostic to the base learners and produces a 6.5% improvement over the best published benchmark for the largest public uplift data which incidentally exhibits high class imbalance. We also introduce a new metric on calibration for uplift modeling and present a strategy to improve the calibration of the proposed method.}
}
Endnote
%0 Conference Paper
%T Uplift Modeling with High Class Imbalance
%A Otto Nyberg
%A Tomasz Kuśmierczyk
%A Arto Klami
%B Proceedings of The 13th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Vineeth N. Balasubramanian
%E Ivor Tsang
%F pmlr-v157-nyberg21a
%I PMLR
%P 315--330
%U https://proceedings.mlr.press/v157/nyberg21a.html
%V 157
%X Uplift modeling refers to estimating the causal effect of a treatment on an individual observation, used for instance to identify customers worth targeting with a discount in e-commerce. We introduce a simple yet effective undersampling strategy for dealing with the prevalent problem of high class imbalance (low conversion rate) in such applications. Our strategy is agnostic to the base learners and produces a 6.5% improvement over the best published benchmark for the largest public uplift data which incidentally exhibits high class imbalance. We also introduce a new metric on calibration for uplift modeling and present a strategy to improve the calibration of the proposed method.
APA
Nyberg, O., Kuśmierczyk, T. & Klami, A. (2021). Uplift Modeling with High Class Imbalance. Proceedings of The 13th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 157:315-330. Available from https://proceedings.mlr.press/v157/nyberg21a.html.
