Positivity-free Policy Learning with Observational Data

Pan Zhao, Antoine Chambaz, Julie Josse, Shu Yang
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1918-1926, 2024.

Abstract

Policy learning from observational data is pivotal across many domains, where the goal is to learn an optimal treatment assignment policy subject to constraints such as fairness, budget, and simplicity. This study introduces a positivity-free (stochastic) policy learning framework that addresses settings in which the positivity assumption is impractical or fails to hold. The framework leverages incremental propensity score policies, which shift propensity score values rather than assigning fixed treatment values. We characterize these incremental propensity score policies, establish identification conditions, and employ semiparametric efficiency theory to propose efficient estimators that attain fast convergence rates even when combined with flexible machine learning algorithms. We provide a thorough analysis of the theoretical guarantees for policy learning and validate the framework's finite-sample performance through comprehensive numerical experiments, showing that the resulting identification of causal effects from observational data is both robust and reliable.
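For context, the incremental propensity score interventions underlying this framework (introduced by Kennedy, 2019, JASA) multiply each subject's odds of treatment by a user-chosen factor δ rather than setting treatment deterministically. Below is a minimal sketch of the resulting stochastic policy in standard notation, assuming π(x) = P(A = 1 | X = x) denotes the observational propensity score; the paper's exact parameterization may differ:

\[
q(x; \delta) \;=\; \frac{\delta\,\pi(x)}{\delta\,\pi(x) + 1 - \pi(x)}, \qquad \delta \in (0, \infty).
\]

Setting δ = 1 recovers the observed treatment process, while δ > 1 (resp. δ < 1) raises (resp. lowers) every subject's treatment probability. Because q(x; δ) only shifts π(x) and never forces treatment probabilities to 0 or 1, identification does not require π(x) to be bounded away from 0 and 1, which is what makes such policies positivity-free.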

Cite this Paper

BibTeX

@InProceedings{pmlr-v238-zhao24a,
  title     = {Positivity-free Policy Learning with Observational Data},
  author    = {Zhao, Pan and Chambaz, Antoine and Josse, Julie and Yang, Shu},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {1918--1926},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/zhao24a/zhao24a.pdf},
  url       = {https://proceedings.mlr.press/v238/zhao24a.html}
}
Endnote

%0 Conference Paper
%T Positivity-free Policy Learning with Observational Data
%A Pan Zhao
%A Antoine Chambaz
%A Julie Josse
%A Shu Yang
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-zhao24a
%I PMLR
%P 1918--1926
%U https://proceedings.mlr.press/v238/zhao24a.html
%V 238
APA

Zhao, P., Chambaz, A., Josse, J. & Yang, S. (2024). Positivity-free Policy Learning with Observational Data. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1918-1926. Available from https://proceedings.mlr.press/v238/zhao24a.html.
