Are all models wrong? Fundamental limits in distribution-free empirical model falsification

Manuel M. Müller, Yuetian Luo, Rina Foygel Barber
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:4271-4308, 2025.

Abstract

In statistics and machine learning, when we train a fitted model on available data, we typically want to ensure that we are searching within a model class that contains at least one accurate model—that is, we would like to ensure an upper bound on the \emph{model class risk} (the lowest possible risk that can be attained by any model in the class). However, it is also of interest to establish lower bounds on the model class risk, for instance so that we can determine whether our fitted model is at least approximately optimal within the class, or so that we can decide whether the model class is unsuitable for the particular task at hand. Particularly in the setting of interpolation learning, where machine learning models are trained to reach zero error on the training data, we might ask if, at the very least, a positive lower bound on the model class risk is possible—or are we unable to detect that “all models are wrong”? In this work, we answer these questions in a distribution-free setting by establishing a model-agnostic, fundamental hardness result for the problem of constructing a lower bound on the best test error achievable over a model class, and examine its implications for specific model classes such as tree-based methods and linear regression.
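To make the central quantity concrete: the model class risk is the infimum of the risk over all models in a class, and its empirical counterpart is the best achievable training error. The following is a minimal illustrative sketch (not taken from the paper) using a toy class of threshold classifiers f_t(x) = 1{x > t} under 0–1 loss; all names and the data-generating setup here are hypothetical.

```python
import numpy as np

# Toy data: noisy binary labels driven by a scalar feature.
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, size=200)
y = (x + 0.3 * rng.normal(size=200) > 0.5).astype(int)

# Model class: threshold classifiers f_t(x) = 1{x > t} over a grid of t.
thresholds = np.linspace(0, 1, 101)

# Empirical risk (0-1 loss) of each model in the class.
risks = [np.mean((x > t).astype(int) != y) for t in thresholds]

# Empirical model class risk: the best training error over the whole class.
# This gives a natural *upper* bound on the model class risk; the paper's
# hardness result concerns the reverse direction -- certifying a positive
# *lower* bound on the best achievable test error, distribution-free.
empirical_model_class_risk = min(risks)
print(empirical_model_class_risk)
```

The sketch shows why the upper-bound direction is easy (just evaluate every model on the sample), whereas the interpolation regime discussed in the abstract corresponds to this minimum being zero, leaving nothing empirical to falsify.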

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-muller25a,
  title     = {Are all models wrong? Fundamental limits in distribution-free empirical model falsification},
  author    = {M{\"u}ller, Manuel M. and Luo, Yuetian and Barber, Rina Foygel},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages     = {4271--4308},
  year      = {2025},
  editor    = {Haghtalab, Nika and Moitra, Ankur},
  volume    = {291},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/muller25a/muller25a.pdf},
  url       = {https://proceedings.mlr.press/v291/muller25a.html},
  abstract  = {In statistics and machine learning, when we train a fitted model on available data, we typically want to ensure that we are searching within a model class that contains at least one accurate model—that is, we would like to ensure an upper bound on the \emph{model class risk} (the lowest possible risk that can be attained by any model in the class). However, it is also of interest to establish lower bounds on the model class risk, for instance so that we can determine whether our fitted model is at least approximately optimal within the class, or, so that we can decide whether the model class is unsuitable for the particular task at hand. Particularly in the setting of interpolation learning where machine learning models are trained to reach zero error on the training data, we might ask if, at the very least, a positive lower bound on the model class risk is possible—or are we unable to detect that “all models are wrong”? In this work, we answer these questions in a distribution-free setting by establishing a model-agnostic, fundamental hardness result for the problem of constructing a lower bound on the best test error achievable over a model class, and examine its implications on specific model classes such as tree-based methods and linear regression.}
}
Endnote
%0 Conference Paper
%T Are all models wrong? Fundamental limits in distribution-free empirical model falsification
%A Manuel M. Müller
%A Yuetian Luo
%A Rina Foygel Barber
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-muller25a
%I PMLR
%P 4271--4308
%U https://proceedings.mlr.press/v291/muller25a.html
%V 291
%X In statistics and machine learning, when we train a fitted model on available data, we typically want to ensure that we are searching within a model class that contains at least one accurate model—that is, we would like to ensure an upper bound on the \emph{model class risk} (the lowest possible risk that can be attained by any model in the class). However, it is also of interest to establish lower bounds on the model class risk, for instance so that we can determine whether our fitted model is at least approximately optimal within the class, or, so that we can decide whether the model class is unsuitable for the particular task at hand. Particularly in the setting of interpolation learning where machine learning models are trained to reach zero error on the training data, we might ask if, at the very least, a positive lower bound on the model class risk is possible—or are we unable to detect that “all models are wrong”? In this work, we answer these questions in a distribution-free setting by establishing a model-agnostic, fundamental hardness result for the problem of constructing a lower bound on the best test error achievable over a model class, and examine its implications on specific model classes such as tree-based methods and linear regression.
APA
Müller, M.M., Luo, Y. & Barber, R.F. (2025). Are all models wrong? Fundamental limits in distribution-free empirical model falsification. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:4271-4308. Available from https://proceedings.mlr.press/v291/muller25a.html.