Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors

Christos Louizos; Max Welling

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:1708-1716, 2016.

Abstract

We introduce a variational Bayesian neural network where the parameters are governed via a probability distribution on random matrices. Specifically, we employ a matrix variate Gaussian (Gupta & Nagar ’99) parameter posterior distribution where we explicitly model the covariance among the input and output dimensions of each layer. Furthermore, with approximate covariance matrices we can achieve a more efficient way to represent those correlations that is also cheaper than fully factorized parameter posteriors. We further show that with the “local reparametrization trick” (Kingma & Welling ’15) on this posterior distribution we arrive at a Gaussian Process (Rasmussen ’06) interpretation of the hidden units in each layer and we, similarly to (Gal & Ghahramani ’15), provide connections with deep Gaussian processes. We continue in taking advantage of this duality and incorporate “pseudo-data” (Snelson & Ghahramani ’05) in our model, which in turn allows for more efficient posterior sampling while maintaining the properties of the original model. The validity of the proposed approach is verified through extensive experiments.
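As a rough illustration of the two sampling routes the abstract mentions, the sketch below draws a weight matrix from a matrix variate Gaussian and, alternatively, samples a layer's pre-activations directly in the spirit of the local reparametrization trick. It is not the authors' code: the function names, the choice of diagonal row and column covariances (one possible reading of "approximate covariance matrices"), and the jitter term are all assumptions made for this example. It relies on the standard property that if W ~ MN(M, U, V) then A W ~ MN(A M, A U Aᵀ, V), so the rows of the pre-activations are correlated through the kernel-like matrix A U Aᵀ.

```python
import numpy as np


def sample_matrix_gaussian(M, u_diag, v_diag, rng):
    """Draw W ~ MN(M, diag(u_diag), diag(v_diag)).

    Uses W = M + sqrt(U) E sqrt(V) with E standard normal.
    Diagonal U (rows/inputs) and V (columns/outputs) are an assumption
    of this sketch; they need r + c variances instead of r * c.
    """
    E = rng.standard_normal(M.shape)
    return M + np.sqrt(u_diag)[:, None] * E * np.sqrt(v_diag)[None, :]


def sample_preactivations(A, M, u_diag, v_diag, rng, jitter=1e-6):
    """Sample B = A W without sampling W.

    For W ~ MN(M, U, V), B ~ MN(A M, A U A^T, V). The jitter value is a
    hypothetical numerical safeguard for the Cholesky factorization.
    """
    mean = A @ M
    K = (A * u_diag) @ A.T                      # row covariance A U A^T, shape (N, N)
    L = np.linalg.cholesky(K + jitter * np.eye(K.shape[0]))
    E = rng.standard_normal(mean.shape)
    return mean + (L @ E) * np.sqrt(v_diag)[None, :]


rng = np.random.default_rng(0)
r, c, n = 5, 4, 3                               # inputs, outputs, mini-batch size
M = rng.standard_normal((r, c))
A = rng.standard_normal((n, r))
u_diag = np.full(r, 0.5)
v_diag = np.full(c, 2.0)

W = sample_matrix_gaussian(M, u_diag, v_diag, rng)
B = sample_preactivations(A, M, u_diag, v_diag, rng)
```

Sampling B directly costs a Cholesky factorization of an N x N matrix per layer, which is only practical for small mini-batches; the sketch makes no attempt to reproduce the pseudo-data construction described in the abstract.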

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-louizos16,
  title = 	 {Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors},
  author = 	 {Louizos, Christos and Welling, Max},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {1708--1716},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/louizos16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/louizos16.html},
  abstract = 	 {We introduce a variational Bayesian neural network where the parameters are governed via a probability distribution on random matrices. Specifically, we employ a matrix variate Gaussian (Gupta & Nagar ’99) parameter posterior distribution where we explicitly model the covariance among the input and output dimensions of each layer. Furthermore, with approximate covariance matrices we can achieve a more efficient way to represent those correlations that is also cheaper than fully factorized parameter posteriors. We further show that with the “local reparametrization trick” (Kingma & Welling ’15) on this posterior distribution we arrive at a Gaussian Process (Rasmussen ’06) interpretation of the hidden units in each layer and we, similarly to (Gal & Ghahramani ’15), provide connections with deep Gaussian processes. We continue in taking advantage of this duality and incorporate “pseudo-data” (Snelson & Ghahramani ’05) in our model, which in turn allows for more efficient posterior sampling while maintaining the properties of the original model. The validity of the proposed approach is verified through extensive experiments.}
}

Endnote

%0 Conference Paper
%T Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors
%A Christos Louizos
%A Max Welling
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-louizos16
%I PMLR
%P 1708--1716
%U https://proceedings.mlr.press/v48/louizos16.html
%V 48
%X We introduce a variational Bayesian neural network where the parameters are governed via a probability distribution on random matrices. Specifically, we employ a matrix variate Gaussian (Gupta & Nagar ’99) parameter posterior distribution where we explicitly model the covariance among the input and output dimensions of each layer. Furthermore, with approximate covariance matrices we can achieve a more efficient way to represent those correlations that is also cheaper than fully factorized parameter posteriors. We further show that with the “local reparametrization trick” (Kingma & Welling ’15) on this posterior distribution we arrive at a Gaussian Process (Rasmussen ’06) interpretation of the hidden units in each layer and we, similarly to (Gal & Ghahramani ’15), provide connections with deep Gaussian processes. We continue in taking advantage of this duality and incorporate “pseudo-data” (Snelson & Ghahramani ’05) in our model, which in turn allows for more efficient posterior sampling while maintaining the properties of the original model. The validity of the proposed approach is verified through extensive experiments.

RIS


TY  - CPAPER
TI  - Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors
AU  - Christos Louizos
AU  - Max Welling
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-louizos16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 1708
EP  - 1716
L1  - http://proceedings.mlr.press/v48/louizos16.pdf
UR  - https://proceedings.mlr.press/v48/louizos16.html
AB  - We introduce a variational Bayesian neural network where the parameters are governed via a probability distribution on random matrices. Specifically, we employ a matrix variate Gaussian (Gupta & Nagar ’99) parameter posterior distribution where we explicitly model the covariance among the input and output dimensions of each layer. Furthermore, with approximate covariance matrices we can achieve a more efficient way to represent those correlations that is also cheaper than fully factorized parameter posteriors. We further show that with the “local reparametrization trick” (Kingma & Welling ’15) on this posterior distribution we arrive at a Gaussian Process (Rasmussen ’06) interpretation of the hidden units in each layer and we, similarly to (Gal & Ghahramani ’15), provide connections with deep Gaussian processes. We continue in taking advantage of this duality and incorporate “pseudo-data” (Snelson & Ghahramani ’05) in our model, which in turn allows for more efficient posterior sampling while maintaining the properties of the original model. The validity of the proposed approach is verified through extensive experiments.
ER  -

APA


Louizos, C. & Welling, M. (2016). Structured and Efficient Variational Deep Learning with Matrix Gaussian Posteriors. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:1708-1716. Available from https://proceedings.mlr.press/v48/louizos16.html.
