Tackling covariate shift with node-based Bayesian neural networks

Trung Q Trinh, Markus Heinonen, Luigi Acerbi, Samuel Kaski
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:21751-21775, 2022.

Abstract

Bayesian neural networks (BNNs) promise improved generalization under covariate shift by providing principled probabilistic representations of epistemic uncertainty. However, weight-based BNNs often struggle with high computational complexity of large-scale architectures and datasets. Node-based BNNs have recently been introduced as scalable alternatives, which induce epistemic uncertainty by multiplying each hidden node with latent random variables, while learning a point-estimate of the weights. In this paper, we interpret these latent noise variables as implicit representations of simple and domain-agnostic data perturbations during training, producing BNNs that perform well under covariate shift due to input corruptions. We observe that the diversity of the implicit corruptions depends on the entropy of the latent variables, and propose a straightforward approach to increase the entropy of these variables during training. We evaluate the method on out-of-distribution image classification benchmarks, and show improved uncertainty estimation of node-based BNNs under covariate shift due to input perturbations. As a side effect, the method also provides robustness against noisy training labels.
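The node-based construction described in the abstract is simple to state in code: each hidden layer keeps a single point-estimate weight matrix, and its pre-activations are multiplied by latent variables drawn from a learned distribution q(z). The sketch below is a minimal, hypothetical PyTorch rendering of that idea under a diagonal-Gaussian q(z); the class name, parameterization, and the entropy method are illustrative assumptions for exposition, not the authors' released code.

    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class NodeLinear(nn.Module):
        """Linear layer with point-estimate weights and multiplicative
        Gaussian latent variables z on the output nodes (illustrative)."""

        def __init__(self, in_features, out_features):
            super().__init__()
            self.weight = nn.Parameter(0.01 * torch.randn(out_features, in_features))
            self.bias = nn.Parameter(torch.zeros(out_features))
            # Variational posterior q(z) = N(mu, sigma^2), one latent per node.
            self.z_mu = nn.Parameter(torch.ones(out_features))
            self.z_log_sigma = nn.Parameter(torch.full((out_features,), -3.0))

        def forward(self, x):
            h = F.linear(x, self.weight, self.bias)
            # Reparameterized sample of the multiplicative node noise;
            # at test time one would average predictions over several samples.
            z = self.z_mu + self.z_log_sigma.exp() * torch.randn_like(h)
            return h * z

        def entropy(self):
            # Entropy of a diagonal Gaussian: sum_i log sigma_i + constant.
            return self.z_log_sigma.sum()

Each sampled z perturbs the hidden activations much as an input corruption would in feature space, which is the interpretation the paper builds on. The abstract only states that the entropy of q(z) is increased during training; one assumption-laden way to realize this is an entropy bonus in the training loss, e.g. loss = F.cross_entropy(logits, y) - gamma * sum(layer.entropy() for layer in node_layers) with gamma > 0, so that higher-entropy latents correspond to a more diverse set of implicit corruptions.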

Cite this Paper

BibTeX
@InProceedings{pmlr-v162-trinh22a,
  title     = {Tackling covariate shift with node-based {B}ayesian neural networks},
  author    = {Trinh, Trung Q and Heinonen, Markus and Acerbi, Luigi and Kaski, Samuel},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {21751--21775},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/trinh22a/trinh22a.pdf},
  url       = {https://proceedings.mlr.press/v162/trinh22a.html},
  abstract  = {Bayesian neural networks (BNNs) promise improved generalization under covariate shift by providing principled probabilistic representations of epistemic uncertainty. However, weight-based BNNs often struggle with high computational complexity of large-scale architectures and datasets. Node-based BNNs have recently been introduced as scalable alternatives, which induce epistemic uncertainty by multiplying each hidden node with latent random variables, while learning a point-estimate of the weights. In this paper, we interpret these latent noise variables as implicit representations of simple and domain-agnostic data perturbations during training, producing BNNs that perform well under covariate shift due to input corruptions. We observe that the diversity of the implicit corruptions depends on the entropy of the latent variables, and propose a straightforward approach to increase the entropy of these variables during training. We evaluate the method on out-of-distribution image classification benchmarks, and show improved uncertainty estimation of node-based BNNs under covariate shift due to input perturbations. As a side effect, the method also provides robustness against noisy training labels.}
}
Endnote
%0 Conference Paper
%T Tackling covariate shift with node-based Bayesian neural networks
%A Trung Q Trinh
%A Markus Heinonen
%A Luigi Acerbi
%A Samuel Kaski
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-trinh22a
%I PMLR
%P 21751--21775
%U https://proceedings.mlr.press/v162/trinh22a.html
%V 162
%X Bayesian neural networks (BNNs) promise improved generalization under covariate shift by providing principled probabilistic representations of epistemic uncertainty. However, weight-based BNNs often struggle with high computational complexity of large-scale architectures and datasets. Node-based BNNs have recently been introduced as scalable alternatives, which induce epistemic uncertainty by multiplying each hidden node with latent random variables, while learning a point-estimate of the weights. In this paper, we interpret these latent noise variables as implicit representations of simple and domain-agnostic data perturbations during training, producing BNNs that perform well under covariate shift due to input corruptions. We observe that the diversity of the implicit corruptions depends on the entropy of the latent variables, and propose a straightforward approach to increase the entropy of these variables during training. We evaluate the method on out-of-distribution image classification benchmarks, and show improved uncertainty estimation of node-based BNNs under covariate shift due to input perturbations. As a side effect, the method also provides robustness against noisy training labels.
APA
Trinh, T.Q., Heinonen, M., Acerbi, L. & Kaski, S. (2022). Tackling covariate shift with node-based Bayesian neural networks. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:21751-21775. Available from https://proceedings.mlr.press/v162/trinh22a.html.