Error Amplification When Updating Deployed Machine Learning Models
Proceedings of the 7th Machine Learning for Healthcare Conference, PMLR 182:715-740, 2022.
Abstract
As machine learning (ML) shows vast potential in real-world applications, the number of deployed models has increased substantially, yet little attention has been devoted to validating and improving model performance over time. Model updates, sometimes frequent, are essential for dealing with data shift and policy changes and, more generally, for improving model performance. However, updating also carries a significant risk of amplifying model errors unless effort is put into preventing it. To date, very little analysis has examined what can happen once models are deployed and become part of the decision-making process, where human and machine error can no longer be disentangled in the collected labels. We term this phenomenon error amplification: model errors corrupt future labels and are reinforced by updates, eventually causing the model to predict its own outputs rather than the labels of interest. We analyze the factors that influence the magnitude of error amplification and provide guidance for model and threshold selection when error amplification is a risk. We demonstrate that a variety of learning techniques cannot handle the systematic way in which error amplification corrupts observed outcomes. Finally, based on our empirical evaluations, we discuss both procedural and modeling solutions to reduce model deterioration over time.
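To make the feedback loop described above concrete, the following is a minimal illustrative sketch, not the authors' experimental setup. It assumes synthetic data, a logistic regression model, and a hypothetical `influence` parameter controlling what fraction of observed labels simply follow the deployed model's predictions; the model is then naively retrained on those corrupted labels each round, so its errors can feed back into its own training data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

rng = np.random.default_rng(0)

def make_batch(n, d=10):
    """Draw features and ground-truth binary labels from a fixed linear rule."""
    X = rng.normal(size=(n, d))
    y = (X @ np.ones(d) + rng.normal(scale=1.0, size=n) > 0).astype(int)
    return X, y

# Initial model trained on a small clean sample, so it has some errors to amplify.
X0, y0 = make_batch(n=200)
model = LogisticRegression().fit(X0, y0)

influence = 0.5  # assumed fraction of records whose observed label copies the model's output
for round_ in range(1, 11):
    X, y_true = make_batch(n=2000)
    y_pred = model.predict(X)
    # Feedback mechanism: once deployed, the model shapes the recorded outcome,
    # so a share of observed labels are replaced by its own predictions.
    follow = rng.random(len(y_true)) < influence
    y_observed = np.where(follow, y_pred, y_true)
    # Naive update: retrain on the corrupted labels and track drift against ground truth.
    model = LogisticRegression().fit(X, y_observed)
    acc = accuracy_score(y_true, model.predict(X))
    print(f"round {round_:2d}  accuracy vs. true labels: {acc:.3f}")
```

Printing accuracy against the true labels each round makes any drift visible; in a real deployment the true labels are unobservable, which is exactly why the corruption is hard to detect.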