Calibrated and Conformal Propensity Scores for Causal Effect Estimation

Shachi Deshpande, Volodymyr Kuleshov
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:1083-1111, 2024.

Abstract

Propensity scores are commonly used to balance observed covariates while estimating treatment effects. We argue that the probabilistic output of a learned propensity score model should be calibrated, i.e. a predictive treatment probability of 90% should correspond to 90% of individuals being assigned the treatment group. We propose simple recalibration techniques to ensure this property. We prove that calibration is a necessary condition for unbiased treatment effect estimation when using popular inverse propensity weighted and doubly robust estimators. We derive error bounds on causal effect estimates that directly relate to the quality of uncertainties provided by the probabilistic propensity score model and show that calibration strictly improves this error bound while also avoiding extreme propensity weights. We demonstrate improved causal effect estimation with calibrated propensity scores in several tasks including high-dimensional image covariates and genome-wide association studies (GWASs). Calibrated propensity scores improve the speed of GWAS analysis by more than two-fold by enabling the use of simpler models that are faster to train.

Cite this Paper


BibTeX
@InProceedings{pmlr-v244-deshpande24a,
  title = {Calibrated and Conformal Propensity Scores for Causal Effect Estimation},
  author = {Deshpande, Shachi and Kuleshov, Volodymyr},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages = {1083--1111},
  year = {2024},
  editor = {Kiyavash, Negar and Mooij, Joris M.},
  volume = {244},
  series = {Proceedings of Machine Learning Research},
  month = {15--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/deshpande24a/deshpande24a.pdf},
  url = {https://proceedings.mlr.press/v244/deshpande24a.html},
  abstract = {Propensity scores are commonly used to balance observed covariates while estimating treatment effects. We argue that the probabilistic output of a learned propensity score model should be calibrated, i.e. a predictive treatment probability of 90% should correspond to 90% of individuals being assigned the treatment group. We propose simple recalibration techniques to ensure this property. We prove that calibration is a necessary condition for unbiased treatment effect estimation when using popular inverse propensity weighted and doubly robust estimators. We derive error bounds on causal effect estimates that directly relate to the quality of uncertainties provided by the probabilistic propensity score model and show that calibration strictly improves this error bound while also avoiding extreme propensity weights. We demonstrate improved causal effect estimation with calibrated propensity scores in several tasks including high-dimensional image covariates and genome-wide association studies (GWASs). Calibrated propensity scores improve the speed of GWAS analysis by more than two-fold by enabling the use of simpler models that are faster to train.}
}
Endnote
%0 Conference Paper
%T Calibrated and Conformal Propensity Scores for Causal Effect Estimation
%A Shachi Deshpande
%A Volodymyr Kuleshov
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-deshpande24a
%I PMLR
%P 1083--1111
%U https://proceedings.mlr.press/v244/deshpande24a.html
%V 244
%X Propensity scores are commonly used to balance observed covariates while estimating treatment effects. We argue that the probabilistic output of a learned propensity score model should be calibrated, i.e. a predictive treatment probability of 90% should correspond to 90% of individuals being assigned the treatment group. We propose simple recalibration techniques to ensure this property. We prove that calibration is a necessary condition for unbiased treatment effect estimation when using popular inverse propensity weighted and doubly robust estimators. We derive error bounds on causal effect estimates that directly relate to the quality of uncertainties provided by the probabilistic propensity score model and show that calibration strictly improves this error bound while also avoiding extreme propensity weights. We demonstrate improved causal effect estimation with calibrated propensity scores in several tasks including high-dimensional image covariates and genome-wide association studies (GWASs). Calibrated propensity scores improve the speed of GWAS analysis by more than two-fold by enabling the use of simpler models that are faster to train.
APA
Deshpande, S., & Kuleshov, V. (2024). Calibrated and Conformal Propensity Scores for Causal Effect Estimation. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:1083-1111. Available from https://proceedings.mlr.press/v244/deshpande24a.html.

Related Material