On the Asymptotic Distribution of the Minimum Empirical Risk

Jacob Westerhout, Trungtin Nguyen, Xin Guo, Hien Duy Nguyen
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:52869-52902, 2024.

Abstract

Empirical risk minimization (ERM) is a foundational framework for the estimation of solutions to statistical and machine learning problems. Characterizing the distributional properties of the minimum empirical risk (MER) provides valuable tools for conducting inference and assessing the goodness of model fit. We provide a comprehensive account of the asymptotic distribution for the order-$\sqrt{n}$ blowup of the MER under generic and abstract assumptions, and present practical conditions under which our theorems hold. Our results improve upon and relax the assumptions made in previous works. Specifically, we provide asymptotic distributions for MERs when the data are not independent and identically distributed, and when the loss functions may be discontinuous or indexed by non-Euclidean spaces. We further present results that enable the application of these asymptotics for statistical inference, specifically the construction of consistent confidence sets using the bootstrap and of consistent hypothesis tests using penalized model selection. We illustrate the utility of our approach by applying our results to neural network problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-westerhout24a,
  title = {On the Asymptotic Distribution of the Minimum Empirical Risk},
  author = {Westerhout, Jacob and Nguyen, Trungtin and Guo, Xin and Nguyen, Hien Duy},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {52869--52902},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/westerhout24a/westerhout24a.pdf},
  url = {https://proceedings.mlr.press/v235/westerhout24a.html},
  abstract = {Empirical risk minimization (ERM) is a foundational framework for the estimation of solutions to statistical and machine learning problems. Characterizing the distributional properties of the minimum empirical risk (MER) provides valuable tools for conducting inference and assessing the goodness of model fit. We provide a comprehensive account of the asymptotic distribution for the order-$\sqrt{n}$ blowup of the MER under generic and abstract assumptions, and present practical conditions under which our theorems hold. Our results improve upon and relax the assumptions made in previous works. Specifically, we provide asymptotic distributions for MERs for non-independent and identically distributed data, and when the loss functions may be discontinuous or indexed by non-Euclidean spaces. We further present results that enable the application of these asymptotics for statistical inference. Specifically, the construction of consistent confidence sets using the bootstrap and consistent hypothesis tests using penalized model selection. We illustrate the utility of our approach by applying our results to neural network problems.}
}
Endnote
%0 Conference Paper
%T On the Asymptotic Distribution of the Minimum Empirical Risk
%A Jacob Westerhout
%A Trungtin Nguyen
%A Xin Guo
%A Hien Duy Nguyen
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-westerhout24a
%I PMLR
%P 52869--52902
%U https://proceedings.mlr.press/v235/westerhout24a.html
%V 235
%X Empirical risk minimization (ERM) is a foundational framework for the estimation of solutions to statistical and machine learning problems. Characterizing the distributional properties of the minimum empirical risk (MER) provides valuable tools for conducting inference and assessing the goodness of model fit. We provide a comprehensive account of the asymptotic distribution for the order-$\sqrt{n}$ blowup of the MER under generic and abstract assumptions, and present practical conditions under which our theorems hold. Our results improve upon and relax the assumptions made in previous works. Specifically, we provide asymptotic distributions for MERs for non-independent and identically distributed data, and when the loss functions may be discontinuous or indexed by non-Euclidean spaces. We further present results that enable the application of these asymptotics for statistical inference. Specifically, the construction of consistent confidence sets using the bootstrap and consistent hypothesis tests using penalized model selection. We illustrate the utility of our approach by applying our results to neural network problems.
APA
Westerhout, J., Nguyen, T., Guo, X. & Nguyen, H.D. (2024). On the Asymptotic Distribution of the Minimum Empirical Risk. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:52869-52902. Available from https://proceedings.mlr.press/v235/westerhout24a.html.
