Learning Sigmoid Belief Networks via Monte Carlo Expectation Maximization

Zhao Song; Ricardo Henao; David Carlson; Lawrence Carin

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1347-1355, 2016.

Abstract

Belief networks are commonly used generative models of data, but require expensive posterior estimation to train and test the model. Learning typically proceeds by posterior sampling, variational approximations, or recognition networks, combined with stochastic optimization. We propose using an online Monte Carlo expectation-maximization (MCEM) algorithm to learn the maximum a posteriori (MAP) estimator of the generative model or optimize the variational lower bound of a recognition network. The E-step in this algorithm requires posterior samples, which are already generated in current learning schema. For the M-step, we augment with Polya-Gamma (PG) random variables to give an analytic updating scheme. We show relationships to standard learning approaches by deriving stochastic gradient ascent in the MCEM framework. We apply the proposed methods to both binary and count data. Experimental results show that MCEM improves the convergence speed and often improves hold-out performance over existing learning methods. Our approach is readily generalized to other recognition networks.
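The analytic M-step mentioned in the abstract can be illustrated on a much simpler problem. The sketch below is not the authors' algorithm: it is plain batch EM for a single sigmoid layer with fully observed inputs, whereas the paper works with Monte Carlo samples of hidden units and online updates. It relies on the standard result that for omega ~ PG(1, psi), E[omega] = tanh(psi/2) / (2 psi), which turns the update of the weights into a linear solve. The names `pg_mean`, `em_logistic`, and `prior_prec` are hypothetical, and the Gaussian prior is an assumption made so that a MAP estimate exists.

```python
import numpy as np

def pg_mean(psi):
    """E[omega] for omega ~ PG(1, psi): tanh(psi/2) / (2*psi), with limit 1/4 at psi = 0."""
    psi = np.asarray(psi, dtype=float)
    small = np.abs(psi) < 1e-8
    safe = np.where(small, 1.0, psi)  # avoid dividing by zero
    return np.where(small, 0.25, np.tanh(safe / 2.0) / (2.0 * safe))

def em_logistic(H, v, n_iter=50, prior_prec=1.0):
    """MAP estimate of w in p(v=1 | h) = sigmoid(h @ w), prior w ~ N(0, I / prior_prec).

    H: (n, d) array of inputs (observed here; assumption for illustration).
    v: (n,) array of binary targets in {0, 1}.
    """
    n, d = H.shape
    w = np.zeros(d)
    kappa = v - 0.5
    for _ in range(n_iter):
        # E-step: expected Polya-Gamma variables given the current weights.
        omega = pg_mean(H @ w)
        # M-step: the augmented objective is quadratic in w, so solve it exactly.
        A = H.T @ (omega[:, None] * H) + prior_prec * np.eye(d)
        w = np.linalg.solve(A, H.T @ kappa)
    return w
```

At a fixed point of this iteration the gradient of the log posterior, `H.T @ (v - sigmoid(H @ w)) - prior_prec * w`, vanishes, which gives a simple check that the closed-form update targets the same optimum as gradient ascent would.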

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-song16,
  title = 	 {Learning Sigmoid Belief Networks via Monte Carlo Expectation Maximization},
  author = 	 {Song, Zhao and Henao, Ricardo and Carlson, David and Carin, Lawrence},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1347--1355},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/song16.pdf},
  url = 	 {https://proceedings.mlr.press/v51/song16.html},
  abstract = 	 {Belief networks are commonly used generative models of data, but require expensive posterior estimation to train and test the model. Learning typically proceeds by posterior sampling, variational approximations, or recognition networks, combined with stochastic optimization. We propose using an online Monte Carlo expectation-maximization (MCEM) algorithm to learn the maximum a posteriori (MAP) estimator of the generative model or optimize the variational lower bound of a recognition network. The E-step in this algorithm requires posterior samples, which are already generated in current learning schema. For the M-step, we augment with Polya-Gamma (PG) random variables to give an analytic updating scheme. We show relationships to standard learning approaches by deriving stochastic gradient ascent in the MCEM framework. We apply the proposed methods to both binary and count data. Experimental results show that MCEM improves the convergence speed and often improves hold-out performance over existing learning methods. Our approach is readily generalized to other recognition networks.}
}

Endnote

%0 Conference Paper
%T Learning Sigmoid Belief Networks via Monte Carlo Expectation Maximization
%A Zhao Song
%A Ricardo Henao
%A David Carlson
%A Lawrence Carin
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-song16
%I PMLR
%P 1347--1355
%U https://proceedings.mlr.press/v51/song16.html
%V 51
%X Belief networks are commonly used generative models of data, but require expensive posterior estimation to train and test the model. Learning typically proceeds by posterior sampling, variational approximations, or recognition networks, combined with stochastic optimization. We propose using an online Monte Carlo expectation-maximization (MCEM) algorithm to learn the maximum a posteriori (MAP) estimator of the generative model or optimize the variational lower bound of a recognition network. The E-step in this algorithm requires posterior samples, which are already generated in current learning schema. For the M-step, we augment with Polya-Gamma (PG) random variables to give an analytic updating scheme. We show relationships to standard learning approaches by deriving stochastic gradient ascent in the MCEM framework. We apply the proposed methods to both binary and count data. Experimental results show that MCEM improves the convergence speed and often improves hold-out performance over existing learning methods. Our approach is readily generalized to other recognition networks.

RIS


TY  - CPAPER
TI  - Learning Sigmoid Belief Networks via Monte Carlo Expectation Maximization
AU  - Zhao Song
AU  - Ricardo Henao
AU  - David Carlson
AU  - Lawrence Carin
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-song16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 1347
EP  - 1355
L1  - http://proceedings.mlr.press/v51/song16.pdf
UR  - https://proceedings.mlr.press/v51/song16.html
AB  - Belief networks are commonly used generative models of data, but require expensive posterior estimation to train and test the model. Learning typically proceeds by posterior sampling, variational approximations, or recognition networks, combined with stochastic optimization. We propose using an online Monte Carlo expectation-maximization (MCEM) algorithm to learn the maximum a posteriori (MAP) estimator of the generative model or optimize the variational lower bound of a recognition network. The E-step in this algorithm requires posterior samples, which are already generated in current learning schema. For the M-step, we augment with Polya-Gamma (PG) random variables to give an analytic updating scheme. We show relationships to standard learning approaches by deriving stochastic gradient ascent in the MCEM framework. We apply the proposed methods to both binary and count data. Experimental results show that MCEM improves the convergence speed and often improves hold-out performance over existing learning methods. Our approach is readily generalized to other recognition networks.
ER  -

APA


Song, Z., Henao, R., Carlson, D. & Carin, L. (2016). Learning Sigmoid Belief Networks via Monte Carlo Expectation Maximization. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:1347-1355. Available from https://proceedings.mlr.press/v51/song16.html.
