Model updating after interventions paradoxically introduces bias

James Liley, Samuel Emerson, Bilal Mateen, Catalina Vallejos, Louis Aslett, Sebastian Vollmer
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3916-3924, 2021.

Abstract

Machine learning is increasingly being used to generate prediction models for use in a number of real-world settings, from credit risk assessment to clinical decision support. Recent discussions have highlighted potential problems in the updating of a predictive score for a binary outcome when an existing predictive score forms part of the standard workflow, driving interventions. In this setting, the existing score induces an additional causative pathway which leads to miscalibration when the original score is replaced. We propose a general causal framework to describe and address this problem, and demonstrate an equivalent formulation as a partially observed Markov decision process. We use this model to demonstrate the impact of such ‘naive updating’ when performed repeatedly. Namely, we show that successive predictive scores may converge to a point where they predict their own effect, or may eventually tend toward a stable oscillation between two values, and we argue that neither outcome is desirable. Furthermore, we demonstrate that even if model-fitting procedures improve, actual performance may worsen. We complement these findings with a discussion of several potential routes to overcome these issues.
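To make the feedback loop concrete, below is a minimal simulation sketch. It is not taken from the paper: the logistic baseline risk, the multiplicative intervention effect, the parameter values, and the crude logistic refit are all illustrative assumptions. At each epoch the deployed score triggers interventions that suppress the realised risk, a new score is naively refit to the intervened-on outcomes, and that new score replaces the old one; tracking the mean deployed score and mean realised risk across epochs shows how successive scores can drift toward a point where they predict their own effect, or oscillate.

# Minimal sketch (illustrative assumptions only, not the authors' code or model)
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fit_logistic(x, y, iters=500, lr=0.1):
    """Crude logistic regression of y on x (intercept + one covariate) via gradient ascent."""
    X = np.column_stack([np.ones_like(x), x])
    w = np.zeros(2)
    for _ in range(iters):
        p = sigmoid(X @ w)
        w += lr * X.T @ (y - p) / len(y)
    return w

n = 20_000                       # subjects per epoch (assumed)
effect = 0.8                     # assumed fractional risk reduction from intervening
true_w = np.array([-1.0, 2.0])   # assumed true pre-intervention risk model

# Start with an oracle score equal to the true pre-intervention risk.
score_w = true_w.copy()
for epoch in range(8):
    x = rng.normal(size=n)
    base_risk = sigmoid(true_w[0] + true_w[1] * x)   # risk with no intervention
    score = sigmoid(score_w[0] + score_w[1] * x)     # deployed score drives interventions
    post_risk = base_risk * (1.0 - effect * score)   # interventions lower realised risk
    y = rng.binomial(1, post_risk)
    # 'Naive updating': refit to outcomes observed under the old score's interventions,
    # then replace the old score with the new one.
    score_w = fit_logistic(x, y)
    print(f"epoch {epoch}: mean deployed score = {score.mean():.3f}, "
          f"mean realised risk = {post_risk.mean():.3f}")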

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-liley21a,
  title     = {Model updating after interventions paradoxically introduces bias},
  author    = {Liley, James and Emerson, Samuel and Mateen, Bilal and Vallejos, Catalina and Aslett, Louis and Vollmer, Sebastian},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {3916--3924},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/liley21a/liley21a.pdf},
  url       = {https://proceedings.mlr.press/v130/liley21a.html},
  abstract  = {Machine learning is increasingly being used to generate prediction models for use in a number of real-world settings, from credit risk assessment to clinical decision support. Recent discussions have highlighted potential problems in the updating of a predictive score for a binary outcome when an existing predictive score forms part of the standard workflow, driving interventions. In this setting, the existing score induces an additional causative pathway which leads to miscalibration when the original score is replaced. We propose a general causal framework to describe and address this problem, and demonstrate an equivalent formulation as a partially observed Markov decision process. We use this model to demonstrate the impact of such ‘naive updating’ when performed repeatedly. Namely, we show that successive predictive scores may converge to a point where they predict their own effect, or may eventually tend toward a stable oscillation between two values, and we argue that neither outcome is desirable. Furthermore, we demonstrate that even if model-fitting procedures improve, actual performance may worsen. We complement these findings with a discussion of several potential routes to overcome these issues.}
}
Endnote
%0 Conference Paper
%T Model updating after interventions paradoxically introduces bias
%A James Liley
%A Samuel Emerson
%A Bilal Mateen
%A Catalina Vallejos
%A Louis Aslett
%A Sebastian Vollmer
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-liley21a
%I PMLR
%P 3916--3924
%U https://proceedings.mlr.press/v130/liley21a.html
%V 130
%X Machine learning is increasingly being used to generate prediction models for use in a number of real-world settings, from credit risk assessment to clinical decision support. Recent discussions have highlighted potential problems in the updating of a predictive score for a binary outcome when an existing predictive score forms part of the standard workflow, driving interventions. In this setting, the existing score induces an additional causative pathway which leads to miscalibration when the original score is replaced. We propose a general causal framework to describe and address this problem, and demonstrate an equivalent formulation as a partially observed Markov decision process. We use this model to demonstrate the impact of such ‘naive updating’ when performed repeatedly. Namely, we show that successive predictive scores may converge to a point where they predict their own effect, or may eventually tend toward a stable oscillation between two values, and we argue that neither outcome is desirable. Furthermore, we demonstrate that even if model-fitting procedures improve, actual performance may worsen. We complement these findings with a discussion of several potential routes to overcome these issues.
APA
Liley, J., Emerson, S., Mateen, B., Vallejos, C., Aslett, L. & Vollmer, S. (2021). Model updating after interventions paradoxically introduces bias. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3916-3924. Available from https://proceedings.mlr.press/v130/liley21a.html.