Improving predictions of Bayesian neural nets via local linearization

Alexander Immer, Maciej Korzepa, Matthias Bauer
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:703-711, 2021.

Abstract

The generalized Gauss-Newton (GGN) approximation is often used to make practical Bayesian deep learning approaches scalable by replacing a second-order derivative with a product of first-order derivatives. In this paper we argue that the GGN approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN), which turns the BNN into a generalized linear model (GLM). Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one. We refer to this modified predictive as the "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation. It extends previous results in this vein to general likelihoods and has an equivalent Gaussian process formulation, which enables alternative inference schemes for BNNs in function space. We demonstrate the effectiveness of our approach on several standard classification datasets as well as on out-of-distribution detection. We provide an implementation at https://github.com/AlexImmer/BNN-predictions.
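To make the GLM predictive concrete: rather than pushing Laplace posterior samples of the weights θ through the original network f(x; θ), one predicts with the model linearized at the MAP estimate θ*, i.e. f_lin(x; θ) = f(x; θ*) + J_{θ*}(x)(θ − θ*). Below is a minimal Monte-Carlo sketch in JAX; the names (apply_fn, map_params, param_samples) are illustrative assumptions and are not taken from the authors' repository.

```python
import jax
import jax.numpy as jnp

def glm_predictive(apply_fn, map_params, param_samples, x):
    """Monte-Carlo GLM predictive: posterior samples are pushed through
    the network linearized at the MAP estimate, not the network itself."""

    def linearized_logits(theta):
        # f_lin(x; theta) = f(x; theta*) + J_{theta*}(x) (theta - theta*)
        delta = jax.tree_util.tree_map(jnp.subtract, theta, map_params)
        f_map, jf_delta = jax.jvp(lambda p: apply_fn(p, x),
                                  (map_params,), (delta,))
        return f_map + jf_delta

    # Apply the inverse link (softmax) per sample, then average.
    logits = jax.vmap(linearized_logits)(param_samples)
    return jax.nn.softmax(logits, axis=-1).mean(axis=0)

# Toy usage with a linear model standing in for the BNN (hypothetical):
key = jax.random.PRNGKey(0)
map_params = {"W": jax.random.normal(key, (3, 2)), "b": jnp.zeros(2)}
apply_fn = lambda p, x: x @ p["W"] + p["b"]
# Crude stand-in for Laplace posterior samples: MAP plus isotropic noise.
param_samples = jax.tree_util.tree_map(
    lambda m: m[None] + 0.1 * jax.random.normal(key, (32,) + m.shape),
    map_params)
probs = glm_predictive(apply_fn, map_params, param_samples, jnp.ones((5, 3)))
```

Because f_lin is linear in θ, a Gaussian posterior over the weights induces a Gaussian distribution over functions, which is what yields the equivalent Gaussian process formulation mentioned in the abstract.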

Cite this Paper

BibTeX
@InProceedings{pmlr-v130-immer21a,
  title     = {Improving predictions of Bayesian neural nets via local linearization},
  author    = {Immer, Alexander and Korzepa, Maciej and Bauer, Matthias},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {703--711},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/immer21a/immer21a.pdf},
  url       = {https://proceedings.mlr.press/v130/immer21a.html}
}
Endnote
%0 Conference Paper
%T Improving predictions of Bayesian neural nets via local linearization
%A Alexander Immer
%A Maciej Korzepa
%A Matthias Bauer
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-immer21a
%I PMLR
%P 703--711
%U https://proceedings.mlr.press/v130/immer21a.html
%V 130
APA
Immer, A., Korzepa, M., & Bauer, M. (2021). Improving predictions of Bayesian neural nets via local linearization. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:703-711. Available from https://proceedings.mlr.press/v130/immer21a.html.
