"Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts

Haoran Zhang, Harvineet Singh, Marzyeh Ghassemi, Shalmali Joshi
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:41550-41578, 2023.

Abstract

Machine learning models frequently experience performance drops under distribution shifts. Such shifts may stem from multiple simultaneous factors, such as changes in data quality, differences in specific covariate distributions, or changes in the relationship between labels and features. When a model does fail during deployment, attributing the performance change to these factors is critical for the model developer to identify the root cause and take mitigating actions. In this work, we introduce the problem of attributing performance differences between environments to distribution shifts in the underlying data generating mechanisms. We formulate the problem as a cooperative game where the players are distributions. We define the value of a set of distributions to be the change in model performance when only this set of distributions has changed between environments, and derive an importance weighting method for computing the value of an arbitrary set of distributions. The contribution of each distribution to the total performance change is then quantified as its Shapley value. We demonstrate the correctness and utility of our method on synthetic, semi-synthetic, and real-world case studies, showing its effectiveness in attributing performance changes to a wide range of distribution shifts.
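
To make the mechanics concrete, below is a minimal Python sketch of the two ingredients the abstract describes: a value function that importance-weights source samples to simulate a world where only a chosen subset of distributions has shifted, and exact Shapley attribution over those distributions. Everything in it is an illustrative assumption rather than the paper's implementation: a two-player game over P(X) and P(Y|X), density ratios known in closed form, and a fixed toy model; the paper handles general causal factorizations and estimated ratios.

    import itertools
    import math

    import numpy as np


    def shapley_values(players, value):
        # Exact Shapley values for a cooperative game whose players are
        # candidate distributions and whose value function returns the
        # performance change when only those distributions shift.
        n = len(players)
        phi = {p: 0.0 for p in players}
        for p in players:
            others = [q for q in players if q != p]
            for r in range(n):
                for subset in itertools.combinations(others, r):
                    s = frozenset(subset)
                    weight = math.factorial(r) * math.factorial(n - r - 1) / math.factorial(n)
                    phi[p] += weight * (value(s | {p}) - value(s))
        return phi


    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))


    rng = np.random.default_rng(0)
    m = 50_000

    # Toy setup (an assumption for illustration, not the paper's experiments):
    # source:  X ~ N(0, 1),  P(Y=1|X) = sigmoid(2x)
    # target:  X ~ N(1, 1),  P(Y=1|X) = sigmoid(-2x)   (both factors shift)
    x = rng.normal(0.0, 1.0, m)                    # sample once, from the SOURCE
    y = (rng.random(m) < sigmoid(2.0 * x)).astype(float)

    loss = (sigmoid(2.0 * x) - y) ** 2             # Brier score of a fixed source-fit model

    # Density ratios for each candidate shift (known in closed form here;
    # in practice they must be estimated, e.g. with domain classifiers).
    w_x = np.exp(x - 0.5)                          # N(1,1) / N(0,1) at each sample
    p_src = np.where(y == 1.0, sigmoid(2.0 * x), 1.0 - sigmoid(2.0 * x))
    p_tgt = np.where(y == 1.0, sigmoid(-2.0 * x), 1.0 - sigmoid(-2.0 * x))
    ratios = {"P(X)": w_x, "P(Y|X)": p_tgt / p_src}


    def value(shifted):
        # Performance change when only the distributions in `shifted` move to
        # the target environment: importance-weight source samples by the
        # product of the corresponding density ratios.
        w = np.ones(m)
        for player in shifted:
            w = w * ratios[player]
        return float(np.mean(w * loss) - np.mean(loss))


    phi = shapley_values(list(ratios), value)
    print(phi)  # by efficiency, attributions sum to value(frozenset(ratios))

The usual caveat of importance weighting applies: the estimator's variance grows with the magnitude of the density ratios, so with estimated ratios the weights may need regularization or clipping.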

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-zhang23ai,
  title     = {"{W}hy did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts},
  author    = {Zhang, Haoran and Singh, Harvineet and Ghassemi, Marzyeh and Joshi, Shalmali},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {41550--41578},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/zhang23ai/zhang23ai.pdf},
  url       = {https://proceedings.mlr.press/v202/zhang23ai.html}
}
Endnote
%0 Conference Paper
%T "Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts
%A Haoran Zhang
%A Harvineet Singh
%A Marzyeh Ghassemi
%A Shalmali Joshi
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-zhang23ai
%I PMLR
%P 41550--41578
%U https://proceedings.mlr.press/v202/zhang23ai.html
%V 202
APA
Zhang, H., Singh, H., Ghassemi, M., & Joshi, S. (2023). "Why did the Model Fail?": Attributing Model Performance Changes to Distribution Shifts. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:41550-41578. Available from https://proceedings.mlr.press/v202/zhang23ai.html.
