Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections

Marco Miani, Hrittik Roy, Søren Hauberg
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:2062-2070, 2025.

Abstract

Bayesian deep learning all too often underfits so that the Bayesian prediction is less accurate than a simple point estimate. Uncertainty quantification then comes at the cost of accuracy. For linearized models, the null space of the generalized Gauss-Newton matrix corresponds to parameters that preserve the training predictions of the point estimate. We propose to build Bayesian approximations in this null space, thereby guaranteeing that the Bayesian predictive does not underfit. We suggest a matrix-free algorithm for projecting onto this null space, which scales linearly with the number of parameters and quadratically with the number of output dimensions. We further propose an approximation that only scales linearly with parameters to make the method applicable to generative models. An extensive empirical evaluation shows that the approach scales to large models, including vision transformers with 28 million parameters. Code is available at: \url{https://github.com/h-roy/projected-bayes}
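For intuition, below is a minimal JAX sketch of the idea the abstract describes; it is not the authors' implementation (see the linked repository for that), and the names `model_fn`, `make_batch_projector`, and `alternating_null_projection` are illustrative. The key fact is that, for a positive-definite output-space Hessian, the null space of the generalized Gauss-Newton matrix J^T H J coincides with the null space of the Jacobian J, so projecting onto it needs only Jacobian-vector and vector-Jacobian products. Cycling orthogonal projections over mini-batch Jacobians (von Neumann alternating projections) then converges to the projection onto the null space of the full training-set Jacobian.

```python
import jax
import jax.numpy as jnp
from jax.flatten_util import ravel_pytree


def make_batch_projector(model_fn, params, x_batch, ridge=1e-6):
    """Matrix-free orthogonal projection onto the null space of the
    mini-batch Jacobian J_b = d model_fn(params, x_batch) / d params."""
    flat_params, unravel = ravel_pytree(params)

    def f(flat_p):
        # Batch predictions flattened into one vector of length
        # (batch_size * output_dim).
        return model_fn(unravel(flat_p), x_batch).ravel()

    def jvp(v):   # J_b @ v via one forward-mode pass
        return jax.jvp(f, (flat_params,), (v,))[1]

    def vjp(u):   # J_b^T @ u via one reverse-mode pass
        return jax.vjp(f, flat_params)[1](u)[0]

    # J_b J_b^T is only (batch_size * output_dim)^2, so it can be
    # materialized; a small ridge keeps the solve stable if J_b is
    # rank deficient (at the cost of an approximate projection).
    out_dim = f(flat_params).shape[0]
    JJt = jax.vmap(lambda e: jvp(vjp(e)))(jnp.eye(out_dim))
    JJt = JJt + ridge * jnp.eye(out_dim)

    def project(v):
        # v - J^T (J J^T)^{-1} J v: removes the component of v that
        # would change the linearized predictions on this batch.
        return v - vjp(jnp.linalg.solve(JJt, jvp(v)))

    return project


def alternating_null_projection(model_fn, params, batches, v, sweeps=10):
    """Cyclic alternating projections: repeatedly projecting onto each
    batch null space converges to the projection onto their intersection,
    i.e. the null space of the full training Jacobian."""
    projectors = [make_batch_projector(model_fn, params, xb) for xb in batches]
    for _ in range(sweeps):
        for project in projectors:
            v = project(v)
    return v


# Hypothetical usage on a toy two-layer network: directions that survive
# the projection leave the linearized training predictions unchanged.
def model_fn(params, x):
    w1, w2 = params
    return jnp.tanh(x @ w1) @ w2

w1 = jax.random.normal(jax.random.PRNGKey(0), (3, 16)) * 0.1
w2 = jax.random.normal(jax.random.PRNGKey(1), (16, 2)) * 0.1
params = (w1, w2)
batches = [jax.random.normal(jax.random.PRNGKey(i), (8, 3)) for i in range(4)]

flat, _ = ravel_pytree(params)
v = jax.random.normal(jax.random.PRNGKey(42), flat.shape)  # e.g. a prior sample
v_null = alternating_null_projection(model_fn, params, batches, v)
```

Sampling directions like `v` from a Gaussian and projecting them this way yields parameter perturbations whose linearized training predictions match those of the point estimate, which mirrors the no-underfitting guarantee described in the abstract.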

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-miani25a,
  title     = {Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections},
  author    = {Miani, Marco and Roy, Hrittik and Hauberg, S{\o}ren},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {2062--2070},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/miani25a/miani25a.pdf},
  url       = {https://proceedings.mlr.press/v258/miani25a.html},
  abstract  = {Bayesian deep learning all too often underfits so that the Bayesian prediction is less accurate than a simple point estimate. Uncertainty quantification then comes at the cost of accuracy. For linearized models, the null space of the generalized Gauss-Newton matrix corresponds to parameters that preserve the training predictions of the point estimate. We propose to build Bayesian approximations in this null space, thereby guaranteeing that the Bayesian predictive does not underfit. We suggest a matrix-free algorithm for projecting onto this null space, which scales linearly with the number of parameters and quadratically with the number of output dimensions. We further propose an approximation that only scales linearly with parameters to make the method applicable to generative models. An extensive empirical evaluation shows that the approach scales to large models, including vision transformers with 28 million parameters. Code is available at: \url{https://github.com/h-roy/projected-bayes}}
}
Endnote
%0 Conference Paper
%T Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections
%A Marco Miani
%A Hrittik Roy
%A Søren Hauberg
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-miani25a
%I PMLR
%P 2062--2070
%U https://proceedings.mlr.press/v258/miani25a.html
%V 258
%X Bayesian deep learning all too often underfits so that the Bayesian prediction is less accurate than a simple point estimate. Uncertainty quantification then comes at the cost of accuracy. For linearized models, the null space of the generalized Gauss-Newton matrix corresponds to parameters that preserve the training predictions of the point estimate. We propose to build Bayesian approximations in this null space, thereby guaranteeing that the Bayesian predictive does not underfit. We suggest a matrix-free algorithm for projecting onto this null space, which scales linearly with the number of parameters and quadratically with the number of output dimensions. We further propose an approximation that only scales linearly with parameters to make the method applicable to generative models. An extensive empirical evaluation shows that the approach scales to large models, including vision transformers with 28 million parameters. Code is available at: \url{https://github.com/h-roy/projected-bayes}
APA
Miani, M., Roy, H. & Hauberg, S. (2025). Bayes without Underfitting: Fully Correlated Deep Learning Posteriors via Alternating Projections. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:2062-2070. Available from https://proceedings.mlr.press/v258/miani25a.html.
