Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo

Matthew D. Hoffman
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1510-1519, 2017.

Abstract

Deep latent Gaussian models are powerful and popular probabilistic models of high-dimensional data. These models are almost always fit using variational expectation-maximization, an approximation to true maximum-marginal-likelihood estimation. In this paper, we propose a different approach: rather than use a variational approximation (which produces biased gradient signals), we use Markov chain Monte Carlo (MCMC, which allows us to trade bias for computation). We find that our MCMC-based approach has several advantages: it yields higher held-out likelihoods, produces sharper images, and does not suffer from the variational overpruning effect. MCMC’s additional computational overhead proves to be significant, but not prohibitive.
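To make the contrast with variational EM concrete, the following is a minimal, self-contained sketch of the general MCMC-EM recipe the abstract describes: alternate MCMC updates on the latent variables with stochastic gradient updates on the model parameters, so that bias can be reduced by simply running the chain longer. The toy linear-Gaussian decoder, the unadjusted Langevin sampler, and all hyperparameters below are illustrative assumptions, not the paper's actual architecture or sampler (the paper works with deeper networks and Hamiltonian Monte Carlo).

import numpy as np

# Toy latent Gaussian model with a linear decoder:
#   z ~ N(0, I),  x | z ~ N(W z, sigma^2 I).
# MCMC-based maximum-marginal-likelihood training, sketched:
#   (1) take a few MCMC steps on z given x and the current W,
#   (2) take a gradient step on W using the sampled z.
# All names and settings here are hypothetical, for illustration only.

rng = np.random.default_rng(0)
D, K, N = 20, 5, 200          # data dim, latent dim, number of examples
sigma = 0.5

# Synthetic data drawn from a ground-truth model.
W_true = rng.normal(size=(D, K))
Z_true = rng.normal(size=(N, K))
X = Z_true @ W_true.T + sigma * rng.normal(size=(N, D))

def grad_log_joint(W, Z, X):
    """Gradient of log p(x, z; W) w.r.t. z: prior term plus likelihood term."""
    resid = X - Z @ W.T
    return -Z + (resid @ W) / sigma**2

W = 0.1 * rng.normal(size=(D, K))
Z = np.zeros((N, K))          # persistent MCMC state, one chain per example
step, lr = 1e-2, 1e-3

for it in range(2000):
    # (1) A few unadjusted Langevin steps on the latents (one simple MCMC choice).
    for _ in range(5):
        noise = rng.normal(size=Z.shape)
        Z = Z + step * grad_log_joint(W, Z, X) + np.sqrt(2 * step) * noise
    # (2) Gradient ascent on log p(X, Z; W) w.r.t. W, holding the samples fixed.
    resid = X - Z @ W.T
    W += lr * (resid.T @ Z) / sigma**2

print("reconstruction RMSE:", np.sqrt(np.mean((X - Z @ W.T) ** 2)))

Because the latent samples come from a Markov chain targeting the exact posterior rather than a fitted variational family, the parameter gradients are asymptotically unbiased; the price is the extra inner-loop sampling, which is the computational overhead the abstract refers to.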

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-hoffman17a,
  title     = {Learning Deep Latent {G}aussian Models with {M}arkov Chain {M}onte {C}arlo},
  author    = {Matthew D. Hoffman},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {1510--1519},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/hoffman17a/hoffman17a.pdf},
  url       = {https://proceedings.mlr.press/v70/hoffman17a.html},
  abstract  = {Deep latent Gaussian models are powerful and popular probabilistic models of high-dimensional data. These models are almost always fit using variational expectation-maximization, an approximation to true maximum-marginal-likelihood estimation. In this paper, we propose a different approach: rather than use a variational approximation (which produces biased gradient signals), we use Markov chain Monte Carlo (MCMC, which allows us to trade bias for computation). We find that our MCMC-based approach has several advantages: it yields higher held-out likelihoods, produces sharper images, and does not suffer from the variational overpruning effect. MCMC’s additional computational overhead proves to be significant, but not prohibitive.}
}
Endnote
%0 Conference Paper
%T Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo
%A Matthew D. Hoffman
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-hoffman17a
%I PMLR
%P 1510--1519
%U https://proceedings.mlr.press/v70/hoffman17a.html
%V 70
%X Deep latent Gaussian models are powerful and popular probabilistic models of high-dimensional data. These models are almost always fit using variational expectation-maximization, an approximation to true maximum-marginal-likelihood estimation. In this paper, we propose a different approach: rather than use a variational approximation (which produces biased gradient signals), we use Markov chain Monte Carlo (MCMC, which allows us to trade bias for computation). We find that our MCMC-based approach has several advantages: it yields higher held-out likelihoods, produces sharper images, and does not suffer from the variational overpruning effect. MCMC’s additional computational overhead proves to be significant, but not prohibitive.
APA
Hoffman, M. D. (2017). Learning Deep Latent Gaussian Models with Markov Chain Monte Carlo. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1510-1519. Available from https://proceedings.mlr.press/v70/hoffman17a.html.