Adapting the Linearised Laplace Model Evidence for Modern Deep Learning

Javier Antoran, David Janz, James U Allingham, Erik Daxberger, Riccardo Rb Barbano, Eric Nalisnick, Jose Miguel Hernandez-Lobato
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:796-821, 2022.

Abstract

The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.
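The closed-form model evidence the abstract refers to can be illustrated in miniature. The sketch below (not the paper's code; all names and the toy data are illustrative) computes the Laplace approximation to the log evidence, log p(y) ≈ log p(y|w*) + log p(w*) + (d/2) log 2π − (1/2) log det H, for a linear-Gaussian regression model, where the approximation is exact. In linearised Laplace, the network Jacobian plays the role of the feature matrix Phi.

```python
import numpy as np

# Hedged sketch, assuming a linear-Gaussian model where Laplace is exact.
# Phi stands in for the network Jacobian used in linearised Laplace.
rng = np.random.default_rng(0)
n, d = 50, 3
Phi = rng.standard_normal((n, d))          # features / Jacobian
w_true = rng.standard_normal(d)
sigma2, alpha = 0.1, 1.0                   # noise variance, prior precision
y = Phi @ w_true + np.sqrt(sigma2) * rng.standard_normal(n)

# MAP weights and Hessian of the negative log joint
H = Phi.T @ Phi / sigma2 + alpha * np.eye(d)
w_map = np.linalg.solve(H, Phi.T @ y / sigma2)

log_lik = -0.5 * (n * np.log(2 * np.pi * sigma2)
                  + np.sum((y - Phi @ w_map) ** 2) / sigma2)
log_prior = -0.5 * (d * np.log(2 * np.pi / alpha) + alpha * w_map @ w_map)
_, logdet_H = np.linalg.slogdet(H)

# Laplace log evidence: log joint at the mode plus the Gaussian volume term
log_evidence = (log_lik + log_prior
                + 0.5 * d * np.log(2 * np.pi) - 0.5 * logdet_H)
print(f"Laplace log evidence: {log_evidence:.3f}")
```

Because this quantity is differentiable in sigma2 and alpha, it can be maximised to select those hyperparameters, which is the style of model selection the paper examines for deep networks.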

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-antoran22a,
  title =     {Adapting the Linearised {L}aplace Model Evidence for Modern Deep Learning},
  author =    {Antoran, Javier and Janz, David and Allingham, James U and Daxberger, Erik and Barbano, Riccardo Rb and Nalisnick, Eric and Hernandez-Lobato, Jose Miguel},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages =     {796--821},
  year =      {2022},
  editor =    {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume =    {162},
  series =    {Proceedings of Machine Learning Research},
  month =     {17--23 Jul},
  publisher = {PMLR},
  pdf =       {https://proceedings.mlr.press/v162/antoran22a/antoran22a.pdf},
  url =       {https://proceedings.mlr.press/v162/antoran22a.html},
  abstract =  {The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.}
}
Endnote
%0 Conference Paper
%T Adapting the Linearised Laplace Model Evidence for Modern Deep Learning
%A Javier Antoran
%A David Janz
%A James U Allingham
%A Erik Daxberger
%A Riccardo Rb Barbano
%A Eric Nalisnick
%A Jose Miguel Hernandez-Lobato
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-antoran22a
%I PMLR
%P 796--821
%U https://proceedings.mlr.press/v162/antoran22a.html
%V 162
%X The linearised Laplace method for estimating model uncertainty has received renewed attention in the Bayesian deep learning community. The method provides reliable error bars and admits a closed-form expression for the model evidence, allowing for scalable selection of model hyperparameters. In this work, we examine the assumptions behind this method, particularly in conjunction with model selection. We show that these interact poorly with some now-standard tools of deep learning (stochastic approximation methods and normalisation layers) and make recommendations for how to better adapt this classic method to the modern setting. We provide theoretical support for our recommendations and validate them empirically on MLPs, classic CNNs, residual networks with and without normalisation layers, generative autoencoders and transformers.
APA
Antoran, J., Janz, D., Allingham, J.U., Daxberger, E., Barbano, R.R., Nalisnick, E. &amp; Hernandez-Lobato, J.M. (2022). Adapting the Linearised Laplace Model Evidence for Modern Deep Learning. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:796-821. Available from https://proceedings.mlr.press/v162/antoran22a.html.