Learning Structured Weight Uncertainty in Bayesian Neural Networks

Shengyang Sun, Changyou Chen, Lawrence Carin
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:1283-1292, 2017.

Abstract

Deep neural networks (DNNs) are increasingly popular in modern machine learning. Bayesian learning affords the opportunity to quantify posterior uncertainty on DNN model parameters. Most existing work adopts independent Gaussian priors on the model weights, ignoring possible structural information. In this paper, we consider the matrix variate Gaussian (MVG) distribution to model structured correlations within the weights of a DNN. To make posterior inference feasible, we propose a reparametrization of the MVG prior that reduces the complex MVG-based model to an equivalent model with independent Gaussian priors on the transformed weights. Building on this, we develop a scalable Bayesian online inference algorithm by adopting the recently proposed probabilistic backpropagation framework. Experiments on several synthetic and real datasets indicate the advantages of our model, which achieves competitive performance in terms of model likelihood and predictive root mean square error. Importantly, it also converges faster than related Bayesian DNN models.
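
To give intuition for the reparametrization mentioned above, here is a minimal sketch (our own illustration based on the standard matrix variate Gaussian identity, not necessarily the paper's exact construction; the function name is hypothetical): a structured weight matrix W ~ MN(M, U, V), with row covariance U and column covariance V, can be generated from a matrix of independent standard normal entries.

    import numpy as np

    def sample_mvg_weight(M, U, V, rng=None):
        # Illustrative only: draw W ~ MN(M, U, V) (matrix variate Gaussian)
        # via W = M + A E B^T, where U = A A^T, V = B B^T, and E has
        # independent N(0, 1) entries. Equivalently, vec(W) ~ N(vec(M), V kron U).
        rng = np.random.default_rng() if rng is None else rng
        A = np.linalg.cholesky(U)           # factor of the row covariance
        B = np.linalg.cholesky(V)           # factor of the column covariance
        E = rng.standard_normal(M.shape)    # unstructured, independent Gaussian entries
        return M + A @ E @ B.T

Because the structured weight W is a deterministic transform of the independent Gaussian matrix E, inference machinery built for independent Gaussian priors (such as probabilistic backpropagation) can, in principle, operate on E rather than on W directly.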

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-sun17b,
  title     = {{Learning Structured Weight Uncertainty in Bayesian Neural Networks}},
  author    = {Sun, Shengyang and Chen, Changyou and Carin, Lawrence},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages     = {1283--1292},
  year      = {2017},
  editor    = {Singh, Aarti and Zhu, Jerry},
  volume    = {54},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--22 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v54/sun17b/sun17b.pdf},
  url       = {https://proceedings.mlr.press/v54/sun17b.html},
  abstract  = {Deep neural networks (DNNs) are increasingly popular in modern machine learning. Bayesian learning affords the opportunity to quantify posterior uncertainty on DNN model parameters. Most existing work adopts independent Gaussian priors on the model weights, ignoring possible structural information. In this paper, we consider the matrix variate Gaussian (MVG) distribution to model structured correlations within the weights of a DNN. To make posterior inference feasible, a reparametrization is proposed for the MVG prior, simplifying the complex MVG-based model to an equivalent yet simpler model with independent Gaussian priors on the transformed weights. Consequently, we develop a scalable Bayesian online inference algorithm by adopting the recently proposed probabilistic backpropagation framework. Experiments on several synthetic and real datasets indicate the superiority of our model, achieving competitive performance in terms of model likelihood and predictive root mean square error. Importantly, it also yields faster convergence speed compared to related Bayesian DNN models.}
}
Endnote
%0 Conference Paper
%T Learning Structured Weight Uncertainty in Bayesian Neural Networks
%A Shengyang Sun
%A Changyou Chen
%A Lawrence Carin
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-sun17b
%I PMLR
%P 1283--1292
%U https://proceedings.mlr.press/v54/sun17b.html
%V 54
%X Deep neural networks (DNNs) are increasingly popular in modern machine learning. Bayesian learning affords the opportunity to quantify posterior uncertainty on DNN model parameters. Most existing work adopts independent Gaussian priors on the model weights, ignoring possible structural information. In this paper, we consider the matrix variate Gaussian (MVG) distribution to model structured correlations within the weights of a DNN. To make posterior inference feasible, a reparametrization is proposed for the MVG prior, simplifying the complex MVG-based model to an equivalent yet simpler model with independent Gaussian priors on the transformed weights. Consequently, we develop a scalable Bayesian online inference algorithm by adopting the recently proposed probabilistic backpropagation framework. Experiments on several synthetic and real datasets indicate the superiority of our model, achieving competitive performance in terms of model likelihood and predictive root mean square error. Importantly, it also yields faster convergence speed compared to related Bayesian DNN models.
APA
Sun, S., Chen, C. & Carin, L. (2017). Learning Structured Weight Uncertainty in Bayesian Neural Networks. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:1283-1292. Available from https://proceedings.mlr.press/v54/sun17b.html.