ELBO, regularized maximum likelihood, and their common one-sample approximation for training stochastic neural networks

Sina Däubener, Simon Damm, Asja Fischer
Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, PMLR 286:897-914, 2025.

Abstract

Monte Carlo approximations are central to the training of stochastic neural networks in general, and Bayesian neural networks (BNNs) in particular. We observe that the common one-sample approximation of the standard training objective can be viewed both as maximizing the Evidence Lower Bound (ELBO) and as maximizing a regularized log-likelihood of a compound distribution. This latter approach differs from the ELBO only in the order of the logarithm and expectation, and is theoretically grounded in PAC-Bayes theory. We argue theoretically and demonstrate empirically that training with the regularized maximum likelihood increases prediction variance, enhancing performance in misspecified settings, adversarial robustness, and strengthening out-of-distribution (OOD) detection. Our findings help reconcile previous contradictions in the literature by providing a detailed analysis of how training objectives and Monte Carlo sample sizes affect uncertainty quantification in stochastic neural networks.
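To make the relationship stated in the abstract concrete, here is a minimal sketch in our own notation (not excerpted from the paper). Assume a variational distribution $q_\phi(w)$ over network weights, a prior $p(w)$, and a per-example likelihood $p(y \mid x, w)$; the unit KL weight shown here is the standard choice and may differ from the PAC-Bayes-derived coefficient used in the paper:

\[
\mathcal{L}_{\mathrm{ELBO}}(\phi) \;=\; \mathbb{E}_{q_\phi(w)}\!\big[\log p(y \mid x, w)\big] \;-\; \mathrm{KL}\!\big(q_\phi(w)\,\|\,p(w)\big),
\]
\[
\mathcal{L}_{\mathrm{reg\text{-}ML}}(\phi) \;=\; \log \mathbb{E}_{q_\phi(w)}\!\big[p(y \mid x, w)\big] \;-\; \mathrm{KL}\!\big(q_\phi(w)\,\|\,p(w)\big).
\]

The two objectives differ only in the order of the logarithm and the expectation. With a single Monte Carlo sample $w_1 \sim q_\phi(w)$, both expectations are approximated by the same term, $\log p(y \mid x, w_1) - \mathrm{KL}(q_\phi \,\|\, p)$, so the one-sample training objectives coincide. By Jensen's inequality, $\mathcal{L}_{\mathrm{ELBO}} \le \mathcal{L}_{\mathrm{reg\text{-}ML}}$, and the two objectives separate as the number of Monte Carlo samples grows, which is the regime the paper analyzes.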

Cite this Paper


BibTeX
@InProceedings{pmlr-v286-daubener25a,
  title     = {ELBO, regularized maximum likelihood, and their common one-sample approximation for training stochastic neural networks},
  author    = {D\"{a}ubener, Sina and Damm, Simon and Fischer, Asja},
  booktitle = {Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence},
  pages     = {897--914},
  year      = {2025},
  editor    = {Chiappa, Silvia and Magliacane, Sara},
  volume    = {286},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--25 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v286/main/assets/daubener25a/daubener25a.pdf},
  url       = {https://proceedings.mlr.press/v286/daubener25a.html},
  abstract  = {Monte Carlo approximations are central to the training of stochastic neural networks in general, and Bayesian neural networks (BNNs) in particular. We observe that the common one-sample approximation of the standard training objective can be viewed both as maximizing the Evidence Lower Bound (ELBO) and as maximizing a regularized log-likelihood of a compound distribution. This latter approach differs from the ELBO only in the order of the logarithm and expectation, and is theoretically grounded in PAC-Bayes theory. We argue theoretically and demonstrate empirically that training with the regularized maximum likelihood increases prediction variance, enhancing performance in misspecified settings, adversarial robustness, and strengthening out-of-distribution (OOD) detection. Our findings help reconcile previous contradictions in the literature by providing a detailed analysis of how training objectives and Monte Carlo sample sizes affect uncertainty quantification in stochastic neural networks.}
}
Endnote
%0 Conference Paper
%T ELBO, regularized maximum likelihood, and their common one-sample approximation for training stochastic neural networks
%A Sina Däubener
%A Simon Damm
%A Asja Fischer
%B Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2025
%E Silvia Chiappa
%E Sara Magliacane
%F pmlr-v286-daubener25a
%I PMLR
%P 897--914
%U https://proceedings.mlr.press/v286/daubener25a.html
%V 286
%X Monte Carlo approximations are central to the training of stochastic neural networks in general, and Bayesian neural networks (BNNs) in particular. We observe that the common one-sample approximation of the standard training objective can be viewed both as maximizing the Evidence Lower Bound (ELBO) and as maximizing a regularized log-likelihood of a compound distribution. This latter approach differs from the ELBO only in the order of the logarithm and expectation, and is theoretically grounded in PAC-Bayes theory. We argue theoretically and demonstrate empirically that training with the regularized maximum likelihood increases prediction variance, enhancing performance in misspecified settings, adversarial robustness, and strengthening out-of-distribution (OOD) detection. Our findings help reconcile previous contradictions in the literature by providing a detailed analysis of how training objectives and Monte Carlo sample sizes affect uncertainty quantification in stochastic neural networks.
APA
Däubener, S., Damm, S. & Fischer, A. (2025). ELBO, regularized maximum likelihood, and their common one-sample approximation for training stochastic neural networks. Proceedings of the Forty-first Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 286:897-914. Available from https://proceedings.mlr.press/v286/daubener25a.html.
