Incentivizing Compliance with Algorithmic Instruments

Dung Daniel T Ngo, Logan Stapleton, Vasilis Syrgkanis, Steven Wu
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:8045-8055, 2021.

Abstract

Randomized experiments can be susceptible to selection bias due to potential non-compliance by the participants. While much of the existing work has studied compliance as a static behavior, we propose a game-theoretic model to study compliance as a dynamic behavior that may change over time. In rounds, a social planner interacts with a sequence of heterogeneous agents who arrive with their unobserved private type that determines both their prior preferences across the actions (e.g., control and treatment) and their baseline rewards without taking any treatment. The planner provides each agent with a randomized recommendation that may alter their beliefs and their action selection. We develop a novel recommendation mechanism that views the planner's recommendation as a form of instrumental variable (IV) that affects only an agent's action selection, not the observed rewards. We construct such IVs by carefully mapping the history (the interactions between the planner and the previous agents) to a random recommendation. Even though the initial agents may be completely non-compliant, our mechanism can incentivize compliance over time, thereby enabling the estimation of the effect of each treatment and minimizing the cumulative regret of the planner, whose goal is to identify the optimal treatment.
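The identification idea behind the abstract can be made concrete with the standard Wald (instrumental-variable) estimator, where the planner's recommendation Z serves as the instrument, the agent's chosen action A as the treatment, and Y as the observed reward. The sketch below is a minimal Python illustration of that textbook estimator under the paper's stated IV conditions (the recommendation moves actions but affects rewards only through them); it is not the paper's recommendation mechanism, and the synthetic data, compliance rate, and effect size are all hypothetical.

```python
import numpy as np

def wald_iv_estimate(z, a, y):
    """Wald/IV estimate of the treatment effect of action a on reward y,
    using the binary recommendation z as an instrument:

        (E[Y|Z=1] - E[Y|Z=0]) / (E[A|Z=1] - E[A|Z=0]).

    Valid only if z shifts action choice (relevance) and affects y solely
    through a (exclusion restriction)."""
    z, a, y = map(np.asarray, (z, a, y))
    reduced_form = y[z == 1].mean() - y[z == 0].mean()  # effect of Z on Y
    first_stage = a[z == 1].mean() - a[z == 0].mean()   # effect of Z on A (compliance)
    if np.isclose(first_stage, 0.0):
        raise ValueError("Irrelevant instrument: recommendation does not move actions.")
    return reduced_form / first_stage

# Hypothetical usage: z = planner's recommendations, a = agents' chosen
# actions, y = observed rewards, recorded over the rounds of the mechanism.
rng = np.random.default_rng(0)
z = rng.integers(0, 2, size=1000)
compliers = rng.random(1000) < 0.6                     # 60% follow the recommendation
a = np.where(compliers, z, rng.integers(0, 2, size=1000))
y = 2.0 * a + rng.normal(size=1000)                    # true treatment effect = 2.0
print(wald_iv_estimate(z, a, y))                       # close to 2.0
```

Note how the estimator degrades as compliance shrinks: the first stage approaches zero, inflating the variance of the ratio, which is why the paper's mechanism works to incentivize compliance before relying on such estimates.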

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-ngo21a,
  title     = {Incentivizing Compliance with Algorithmic Instruments},
  author    = {Ngo, Dung Daniel T and Stapleton, Logan and Syrgkanis, Vasilis and Wu, Steven},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {8045--8055},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/ngo21a/ngo21a.pdf},
  url       = {https://proceedings.mlr.press/v139/ngo21a.html}
}
Endnote
%0 Conference Paper
%T Incentivizing Compliance with Algorithmic Instruments
%A Dung Daniel T Ngo
%A Logan Stapleton
%A Vasilis Syrgkanis
%A Steven Wu
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-ngo21a
%I PMLR
%P 8045--8055
%U https://proceedings.mlr.press/v139/ngo21a.html
%V 139
APA
Ngo, D.D.T., Stapleton, L., Syrgkanis, V. & Wu, S. (2021). Incentivizing Compliance with Algorithmic Instruments. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:8045-8055. Available from https://proceedings.mlr.press/v139/ngo21a.html.