Bayesian Deep Learning via Subnetwork Inference

Erik Daxberger, Eric Nalisnick, James U Allingham, Javier Antoran, Jose Miguel Hernandez-Lobato
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2510-2521, 2021.

Abstract

The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights are kept as point estimates. This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets. In particular, we implement subnetwork linearized Laplace as a simple, scalable Bayesian deep learning method: We first obtain a MAP estimate of all weights and then infer a full-covariance Gaussian posterior over a subnetwork using the linearized Laplace approximation. We propose a subnetwork selection strategy that aims to maximally preserve the model’s predictive uncertainty. Empirically, our approach compares favorably to ensembles and less expressive posterior approximations over full networks.
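To make the two-step procedure described in the abstract concrete, below is a minimal, self-contained PyTorch sketch of subnetwork linearized Laplace for 1D regression. It is an illustration under stated assumptions, not the authors' code: the toy data, network architecture, noise scale sigma, prior precision prior_prec, and subnetwork size k are all hypothetical, and subnetwork selection uses the simplified largest-marginal-variance criterion (the paper's Wasserstein-based strategy under a factorized posterior approximation).

import torch
import torch.nn as nn

torch.manual_seed(0)

# Toy 1D regression data (illustrative).
X = torch.linspace(-3, 3, 64).unsqueeze(1)
y = torch.sin(X) + 0.1 * torch.randn_like(X)

model = nn.Sequential(nn.Linear(1, 32), nn.Tanh(), nn.Linear(32, 1))
sigma, prior_prec = 0.1, 1.0      # assumed noise scale and prior precision

# Step 1: MAP estimate over *all* weights (L2-regularized fit).
opt = torch.optim.Adam(model.parameters(), lr=1e-2)
for _ in range(2000):
    opt.zero_grad()
    nll = ((model(X) - y) ** 2).sum() / (2 * sigma ** 2)
    reg = 0.5 * prior_prec * sum((p ** 2).sum() for p in model.parameters())
    (nll + reg).backward()
    opt.step()

params = list(model.parameters())

def jacobian(x):
    # Rows are d f(x_i) / d w for the flattened weight vector w.
    rows = []
    for xi in x:
        out = model(xi.unsqueeze(0)).squeeze()
        grads = torch.autograd.grad(out, params)
        rows.append(torch.cat([g.reshape(-1) for g in grads]))
    return torch.stack(rows)                      # shape (N, D)

J = jacobian(X)

# Step 2: select the subnetwork. Under a diagonal Laplace approximation,
# weight d has marginal variance 1 / (sum_i J[i,d]^2 / sigma^2 + prior_prec);
# keep the k weights with the largest marginal variance.
diag_prec = (J ** 2).sum(0) / sigma ** 2 + prior_prec
k = 50                                            # subnetwork size (assumed)
idx = torch.topk(1.0 / diag_prec, k).indices

# Step 3: full-covariance Gaussian (GGN) posterior over the subnetwork only;
# all remaining weights stay fixed at their MAP values.
J_S = J[:, idx]
prec_S = J_S.T @ J_S / sigma ** 2 + prior_prec * torch.eye(k)
cov_S = torch.linalg.inv(prec_S)

# Step 4: linearized predictive distribution at test inputs.
X_test = torch.linspace(-5, 5, 100).unsqueeze(1)
J_test = jacobian(X_test)[:, idx]
mean = model(X_test).detach().squeeze(-1)
var = (J_test @ cov_S * J_test).sum(-1) + sigma ** 2
print(mean[:3], var.sqrt()[:3])                   # predictive mean and std

The per-example autograd loop is only feasible for toy networks; at scale one would batch the Jacobian computation (e.g. with torch.func) and, as in the paper, apply the subnetwork posterior to architectures where full-covariance inference over all weights would be intractable.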

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-daxberger21a,
  title     = {Bayesian Deep Learning via Subnetwork Inference},
  author    = {Daxberger, Erik and Nalisnick, Eric and Allingham, James U and Antoran, Javier and Hernandez-Lobato, Jose Miguel},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2510--2521},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/daxberger21a/daxberger21a.pdf},
  url       = {https://proceedings.mlr.press/v139/daxberger21a.html},
  abstract  = {The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights are kept as point estimates. This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets. In particular, we implement subnetwork linearized Laplace as a simple, scalable Bayesian deep learning method: We first obtain a MAP estimate of all weights and then infer a full-covariance Gaussian posterior over a subnetwork using the linearized Laplace approximation. We propose a subnetwork selection strategy that aims to maximally preserve the model’s predictive uncertainty. Empirically, our approach compares favorably to ensembles and less expressive posterior approximations over full networks.}
}
Endnote
%0 Conference Paper
%T Bayesian Deep Learning via Subnetwork Inference
%A Erik Daxberger
%A Eric Nalisnick
%A James U Allingham
%A Javier Antoran
%A Jose Miguel Hernandez-Lobato
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-daxberger21a
%I PMLR
%P 2510--2521
%U https://proceedings.mlr.press/v139/daxberger21a.html
%V 139
%X The Bayesian paradigm has the potential to solve core issues of deep neural networks such as poor calibration and data inefficiency. Alas, scaling Bayesian inference to large weight spaces often requires restrictive approximations. In this work, we show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors. The other weights are kept as point estimates. This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets. In particular, we implement subnetwork linearized Laplace as a simple, scalable Bayesian deep learning method: We first obtain a MAP estimate of all weights and then infer a full-covariance Gaussian posterior over a subnetwork using the linearized Laplace approximation. We propose a subnetwork selection strategy that aims to maximally preserve the model’s predictive uncertainty. Empirically, our approach compares favorably to ensembles and less expressive posterior approximations over full networks.
APA
Daxberger, E., Nalisnick, E., Allingham, J. U., Antoran, J. & Hernandez-Lobato, J. M. (2021). Bayesian Deep Learning via Subnetwork Inference. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2510-2521. Available from https://proceedings.mlr.press/v139/daxberger21a.html.
