On the Minimal Error of Empirical Risk Minimization

Gil Kur, Alexander Rakhlin
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2849-2852, 2021.

Abstract

We study the minimal squared error of the Empirical Risk Minimization (ERM) procedure in the task of regression, both in random and fixed design settings. Our sharp lower bounds shed light on the possibility (or impossibility) of adapting to the simplicity of the model generating the data. In the fixed design setting, we show that the error is governed by the global complexity of the entire class. In contrast, in random design, ERM may only adapt to simpler models if the local neighborhoods around the regression function are nearly as complex as the class itself, a somewhat counter-intuitive conclusion. We provide sharp lower bounds for the performance of ERM in Donsker and non-Donsker classes. We also discuss our results through the lens of recent studies on interpolation in overparameterized models.
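For concreteness, the following is a minimal LaTeX sketch of the standard least-squares regression setup the abstract refers to. The notation ($\mathcal{F}$ for the function class, $f^*$ for the regression function, $\xi_i$ for the noise) is ours, not taken from the paper, and the precise normalization may differ from the authors'.

\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
Given samples $(x_i, y_i)_{i=1}^{n}$ with $y_i = f^*(x_i) + \xi_i$, where
$f^* \in \mathcal{F}$ and the $\xi_i$ are independent zero-mean noise, the
least-squares ERM estimator is
\[
  \hat{f} \in \arg\min_{f \in \mathcal{F}}
  \frac{1}{n} \sum_{i=1}^{n} \bigl( f(x_i) - y_i \bigr)^2 .
\]
In random design, $x_1, \dots, x_n \sim P$ i.i.d.\ and the squared error is
$\mathbb{E}\, \lVert \hat{f} - f^* \rVert_{L_2(P)}^2$; in fixed design, the
$x_i$ are deterministic and the error is the in-sample average
\[
  \frac{1}{n} \sum_{i=1}^{n} \bigl( \hat{f}(x_i) - f^*(x_i) \bigr)^2 .
\]
\end{document}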

Cite this Paper

BibTeX
@InProceedings{pmlr-v134-kur21a,
  title     = {On the Minimal Error of Empirical Risk Minimization},
  author    = {Kur, Gil and Rakhlin, Alexander},
  booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory},
  pages     = {2849--2852},
  year      = {2021},
  editor    = {Belkin, Mikhail and Kpotufe, Samory},
  volume    = {134},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v134/kur21a/kur21a.pdf},
  url       = {https://proceedings.mlr.press/v134/kur21a.html}
}
Endnote
%0 Conference Paper
%T On the Minimal Error of Empirical Risk Minimization
%A Gil Kur
%A Alexander Rakhlin
%B Proceedings of Thirty Fourth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2021
%E Mikhail Belkin
%E Samory Kpotufe
%F pmlr-v134-kur21a
%I PMLR
%P 2849--2852
%U https://proceedings.mlr.press/v134/kur21a.html
%V 134
APA
Kur, G., & Rakhlin, A. (2021). On the Minimal Error of Empirical Risk Minimization. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:2849-2852. Available from https://proceedings.mlr.press/v134/kur21a.html.
