Error Amplification When Updating Deployed Machine Learning Models

George Alexandru Adam, Chun-Hao Kingsley Chang, Benjamin Haibe-Kains, Anna Goldenberg
Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:715-740, 2022.

Abstract

As machine learning (ML) shows vast potential in real-world applications, the number of deployed models has been increasing substantially, yet little attention has been devoted to validating and improving model performance over time. Model updates, sometimes frequent, are essential for dealing with data shift and policy changes and, in general, for improving model performance. Updating, however, also carries a significant risk of amplifying model errors if no effort is put into preventing it. Unfortunately, little analysis has been done to date of what can happen once models are deployed and become part of the decision-making process, where there is no longer a way to disentangle human from machine error in the collected labels. The phenomenon of interest, termed error amplification, occurs when model errors corrupt future labels and are reinforced by updates, eventually causing the model to predict its own outputs instead of the labels of interest. We analyze various factors influencing the magnitude of error amplification and provide guidance for model and threshold selection when error amplification is a risk. We demonstrate that a variety of learning techniques cannot handle the systematic way in which error amplification corrupts observed outcomes. Additionally, based on our empirical evaluations, we discuss both procedural and modeling solutions to reduce model deterioration over time.
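
The feedback loop described in the abstract can be made concrete with a small simulation. The following is a minimal sketch, not the paper's experimental setup: a classifier trained on synthetic data is "deployed", its thresholded decisions overwrite a fraction of the labels collected for the next update, and the model is then naively retrained on those corrupted labels. The dataset, corruption rule, influence rate, and decision threshold are illustrative assumptions.

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def sample_batch(n, d=10):
    """Draw a synthetic binary classification batch from a fixed linear concept."""
    w_true = np.linspace(1.0, -1.0, d)           # fixed ground-truth weights
    X = rng.normal(size=(n, d))
    y = (X @ w_true + 0.5 * rng.normal(size=n) > 0).astype(int)
    return X, y

# Initial model trained on cleanly labelled data.
X0, y0 = sample_batch(2000)
model = LogisticRegression(max_iter=1000).fit(X0, y0)

X_test, y_test = sample_batch(5000)              # held-out set with true labels
threshold = 0.5                                  # deployment decision threshold (illustrative)
influence = 0.8                                  # fraction of labels overwritten by the model's decisions (illustrative)

print(f"initial test accuracy = {accuracy_score(y_test, model.predict(X_test)):.3f}")

for update in range(10):
    X_new, y_true = sample_batch(2000)
    y_pred = (model.predict_proba(X_new)[:, 1] >= threshold).astype(int)

    # Error amplification mechanism: once deployed, the model's decisions feed back
    # into the labels recorded for the next update, so the observed labels are a
    # mixture of the true outcomes and the model's own (possibly wrong) outputs.
    corrupted = rng.random(len(y_true)) < influence
    y_observed = np.where(corrupted, y_pred, y_true)

    # Naive update: retrain on the corrupted labels as if they were ground truth.
    model = LogisticRegression(max_iter=1000).fit(X_new, y_observed)

    acc = accuracy_score(y_test, model.predict(X_test))
    agreement = (model.predict(X_new) == y_pred).mean()
    print(f"update {update}: test accuracy vs. true labels = {acc:.3f}, "
          f"agreement with previous model's decisions = {agreement:.3f}")

Tracking test accuracy against the true labels alongside agreement with the previous model's decisions separates the two quantities the abstract contrasts: performance on the labels of interest versus the model reproducing its own outputs.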

Cite this Paper


BibTeX
@InProceedings{pmlr-v182-adam22a,
  title     = {Error Amplification When Updating Deployed Machine Learning Models},
  author    = {Adam, George Alexandru and Chang, Chun-Hao Kingsley and Haibe-Kains, Benjamin and Goldenberg, Anna},
  booktitle = {Proceedings of the 7th Machine Learning for Healthcare Conference},
  pages     = {715--740},
  year      = {2022},
  editor    = {Lipton, Zachary and Ranganath, Rajesh and Sendak, Mark and Sjoding, Michael and Yeung, Serena},
  volume    = {182},
  series    = {Proceedings of Machine Learning Research},
  month     = {05--06 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v182/adam22a/adam22a.pdf},
  url       = {https://proceedings.mlr.press/v182/adam22a.html}
}
Endnote
%0 Conference Paper
%T Error Amplification When Updating Deployed Machine Learning Models
%A George Alexandru Adam
%A Chun-Hao Kingsley Chang
%A Benjamin Haibe-Kains
%A Anna Goldenberg
%B Proceedings of the 7th Machine Learning for Healthcare Conference
%C Proceedings of Machine Learning Research
%D 2022
%E Zachary Lipton
%E Rajesh Ranganath
%E Mark Sendak
%E Michael Sjoding
%E Serena Yeung
%F pmlr-v182-adam22a
%I PMLR
%P 715--740
%U https://proceedings.mlr.press/v182/adam22a.html
%V 182
APA
Adam, G.A., Chang, C.K., Haibe-Kains, B., & Goldenberg, A. (2022). Error Amplification When Updating Deployed Machine Learning Models. Proceedings of the 7th Machine Learning for Healthcare Conference, in Proceedings of Machine Learning Research 182:715-740. Available from https://proceedings.mlr.press/v182/adam22a.html.