Understanding Pathologies of Deep Heteroskedastic Regression

Eliot Wong-Toi, Alex Boyd, Vincent Fortuin, Stephan Mandt
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:3765-3790, 2024.

Abstract

Deep, overparameterized regression models are notorious for their tendency to overfit. This problem is exacerbated in heteroskedastic models, which predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely; at the other, they overfit the residual noise while predicting a constant, uninformative mean. We observe a lack of middle ground, suggesting a phase transition dependent on model regularization strength. Empirical verification supports this conjecture by fitting numerous models with varying mean and variance regularization. To explain the transition, we develop a theoretical framework based on a statistical field theory, yielding qualitative agreement with experiments. As a practical consequence, our analysis simplifies hyperparameter tuning from a two-dimensional to a one-dimensional search, substantially reducing the computational burden. Experiments on diverse datasets, including UCI datasets and the large-scale ClimSim climate dataset, demonstrate significantly improved performance in various calibration tasks.
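To make the setup concrete, below is a minimal PyTorch sketch (not the authors' code) of the kind of model the abstract describes: a network that predicts both a per-input mean and a per-input variance, trained with the Gaussian negative log-likelihood plus separate penalties on the mean and variance heads. The names lam_mu and lam_var are illustrative stand-ins for the two regularization strengths whose two-dimensional search the paper reduces to a one-dimensional one.

import torch
import torch.nn as nn

class HeteroskedasticMLP(nn.Module):
    """Predicts mu(x) and log sigma^2(x) from a shared backbone."""
    def __init__(self, d_in, d_hidden=64):
        super().__init__()
        self.backbone = nn.Sequential(nn.Linear(d_in, d_hidden), nn.ReLU())
        self.mean_head = nn.Linear(d_hidden, 1)
        self.logvar_head = nn.Linear(d_hidden, 1)  # log-variance keeps sigma^2 > 0

    def forward(self, x):
        h = self.backbone(x)
        return self.mean_head(h), self.logvar_head(h)

def loss_fn(model, x, y, lam_mu=1e-3, lam_var=1e-3):
    # lam_mu, lam_var: illustrative names for the two regularization strengths
    mu, logvar = model(x)
    # Gaussian NLL (up to a constant): 0.5 * (log sigma^2 + (y - mu)^2 / sigma^2)
    nll = 0.5 * (logvar + (y - mu) ** 2 / logvar.exp()).mean()
    # Separate L2 penalties on the mean and variance heads
    reg_mu = sum(p.pow(2).sum() for p in model.mean_head.parameters())
    reg_var = sum(p.pow(2).sum() for p in model.logvar_head.parameters())
    return nll + lam_mu * reg_mu + lam_var * reg_var

model = HeteroskedasticMLP(d_in=1)
x, y = torch.randn(128, 1), torch.randn(128, 1)
loss = loss_fn(model, x, y)
loss.backward()

In this sketch, the two pathologies in the abstract correspond to extremes of these knobs: with weak regularization the mean interpolates the training data and the predicted variance collapses toward zero, while with strong mean regularization the mean flattens toward a constant and the variance head absorbs the residuals.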

Cite this Paper


BibTeX
@InProceedings{pmlr-v244-wong-toi24a,
  title     = {Understanding Pathologies of Deep Heteroskedastic Regression},
  author    = {Wong-Toi, Eliot and Boyd, Alex and Fortuin, Vincent and Mandt, Stephan},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages     = {3765--3790},
  year      = {2024},
  editor    = {Kiyavash, Negar and Mooij, Joris M.},
  volume    = {244},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/wong-toi24a/wong-toi24a.pdf},
  url       = {https://proceedings.mlr.press/v244/wong-toi24a.html},
  abstract  = {Deep, overparameterized regression models are notorious for their tendency to overfit. This problem is exacerbated in heteroskedastic models, which predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely; at the other, they overfit the residual noise while predicting a constant, uninformative mean. We observe a lack of middle ground, suggesting a phase transition dependent on model regularization strength. Empirical verification supports this conjecture by fitting numerous models with varying mean and variance regularization. To explain the transition, we develop a theoretical framework based on a statistical field theory, yielding qualitative agreement with experiments. As a practical consequence, our analysis simplifies hyperparameter tuning from a two-dimensional to a one-dimensional search, substantially reducing the computational burden. Experiments on diverse datasets, including UCI datasets and the large-scale ClimSim climate dataset, demonstrate significantly improved performance in various calibration tasks.}
}
Endnote
%0 Conference Paper
%T Understanding Pathologies of Deep Heteroskedastic Regression
%A Eliot Wong-Toi
%A Alex Boyd
%A Vincent Fortuin
%A Stephan Mandt
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-wong-toi24a
%I PMLR
%P 3765--3790
%U https://proceedings.mlr.press/v244/wong-toi24a.html
%V 244
%X Deep, overparameterized regression models are notorious for their tendency to overfit. This problem is exacerbated in heteroskedastic models, which predict both mean and residual noise for each data point. At one extreme, these models fit all training data perfectly, eliminating residual noise entirely; at the other, they overfit the residual noise while predicting a constant, uninformative mean. We observe a lack of middle ground, suggesting a phase transition dependent on model regularization strength. Empirical verification supports this conjecture by fitting numerous models with varying mean and variance regularization. To explain the transition, we develop a theoretical framework based on a statistical field theory, yielding qualitative agreement with experiments. As a practical consequence, our analysis simplifies hyperparameter tuning from a two-dimensional to a one-dimensional search, substantially reducing the computational burden. Experiments on diverse datasets, including UCI datasets and the large-scale ClimSim climate dataset, demonstrate significantly improved performance in various calibration tasks.
APA
Wong-Toi, E., Boyd, A., Fortuin, V. & Mandt, S. (2024). Understanding Pathologies of Deep Heteroskedastic Regression. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:3765-3790. Available from https://proceedings.mlr.press/v244/wong-toi24a.html.
