Variational Rejection Sampling

Aditya Grover, Ramki Gummadi, Miguel Lazaro-Gredilla, Dale Schuurmans, Stefano Ermon
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:823-832, 2018.

Abstract

Learning latent variable models with stochastic variational inference is challenging when the approximate posterior is far from the true posterior, due to high variance in the gradient estimates. We propose a novel rejection sampling step that discards samples from the variational posterior which are assigned low likelihoods by the model. Our approach provides an arbitrarily accurate approximation of the true posterior at the expense of extra computation. Using a new gradient estimator for the resulting unnormalized proposal distribution, we achieve average improvements of 3.71 nats and 0.31 nats over state-of-the-art single-sample and multi-sample alternatives, respectively, for estimating marginal log-likelihoods with sigmoid belief networks on the MNIST dataset. We show both theoretically and empirically how explicitly rejecting samples, while seemingly challenging to analyze due to the implicit nature of the resulting unnormalized proposal distribution, can have benefits over competing alternatives based on importance weighting. We demonstrate the effectiveness of the proposed approach through experiments on synthetic data and a benchmark density estimation task with sigmoid belief networks on the MNIST dataset.
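To make the core idea concrete, below is a minimal Python sketch of the resampling step only, not the paper's exact algorithm: a sample z from the variational posterior q(z|x) is kept with an acceptance probability that compares the model's joint log-likelihood log p(x,z) against log q(z|x) and a threshold T. The callables sample_q, log_joint, and log_q, and the specific acceptance rule min(1, exp(log p(x,z) - log q(z|x) - T)), are illustrative assumptions; the paper's acceptance function and threshold parameterization differ in detail, but the trade-off is the same: a stricter threshold yields a proposal closer to the true posterior at the cost of more rejected samples.

import numpy as np

rng = np.random.default_rng(0)

def vrs_resample(x, sample_q, log_joint, log_q, T, max_tries=10_000):
    # Draw one sample from the implicit proposal r(z | x, T), which is
    # proportional to q(z | x) * a(z), with the (illustrative) acceptance
    # probability a(z) = min(1, exp(log p(x, z) - log q(z | x) - T)).
    # As T -> -inf every sample is accepted and r reduces to q; as T grows,
    # r approaches the true posterior p(z | x) but the acceptance rate drops,
    # trading extra computation for accuracy.
    for _ in range(max_tries):
        z = sample_q(x)                      # propose z ~ q(z | x)
        log_a = min(0.0, log_joint(x, z) - log_q(z, x) - T)
        if np.log(rng.uniform()) < log_a:    # accept with probability exp(log_a)
            return z
    raise RuntimeError("Acceptance rate too low; consider decreasing T.")

Accepted samples would then feed a score-function-style gradient estimator for the resulting unnormalized proposal, which is where the paper's new estimator comes in; the sketch above covers only the sampling step.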

Cite this Paper

BibTeX
@InProceedings{pmlr-v84-grover18a,
  title     = {Variational Rejection Sampling},
  author    = {Grover, Aditya and Gummadi, Ramki and Lazaro-Gredilla, Miguel and Schuurmans, Dale and Ermon, Stefano},
  booktitle = {Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics},
  pages     = {823--832},
  year      = {2018},
  editor    = {Storkey, Amos and Perez-Cruz, Fernando},
  volume    = {84},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--11 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v84/grover18a/grover18a.pdf},
  url       = {https://proceedings.mlr.press/v84/grover18a.html},
  abstract  = {Learning latent variable models with stochastic variational inference is challenging when the approximate posterior is far from the true posterior, due to high variance in the gradient estimates. We propose a novel rejection sampling step that discards samples from the variational posterior which are assigned low likelihoods by the model. Our approach provides an arbitrarily accurate approximation of the true posterior at the expense of extra computation. Using a new gradient estimator for the resulting unnormalized proposal distribution, we achieve average improvements of 3.71 nats and 0.31 nats over state-of-the-art single-sample and multi-sample alternatives respectively for estimating marginal log-likelihoods using sigmoid belief networks on the MNIST dataset. We show both theoretically and empirically how explicitly rejecting samples, while seemingly challenging to analyze due to the implicit nature of the resulting unnormalized proposal distribution, can have benefits over the competing state-of-the-art alternatives based on importance weighting. We demonstrate the effectiveness of the proposed approach via experiments on synthetic data and a benchmark density estimation task with sigmoid belief networks over the MNIST dataset.}
}
Endnote
%0 Conference Paper
%T Variational Rejection Sampling
%A Aditya Grover
%A Ramki Gummadi
%A Miguel Lazaro-Gredilla
%A Dale Schuurmans
%A Stefano Ermon
%B Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2018
%E Amos Storkey
%E Fernando Perez-Cruz
%F pmlr-v84-grover18a
%I PMLR
%P 823--832
%U https://proceedings.mlr.press/v84/grover18a.html
%V 84
%X Learning latent variable models with stochastic variational inference is challenging when the approximate posterior is far from the true posterior, due to high variance in the gradient estimates. We propose a novel rejection sampling step that discards samples from the variational posterior which are assigned low likelihoods by the model. Our approach provides an arbitrarily accurate approximation of the true posterior at the expense of extra computation. Using a new gradient estimator for the resulting unnormalized proposal distribution, we achieve average improvements of 3.71 nats and 0.31 nats over state-of-the-art single-sample and multi-sample alternatives respectively for estimating marginal log-likelihoods using sigmoid belief networks on the MNIST dataset. We show both theoretically and empirically how explicitly rejecting samples, while seemingly challenging to analyze due to the implicit nature of the resulting unnormalized proposal distribution, can have benefits over the competing state-of-the-art alternatives based on importance weighting. We demonstrate the effectiveness of the proposed approach via experiments on synthetic data and a benchmark density estimation task with sigmoid belief networks over the MNIST dataset.
APA
Grover, A., Gummadi, R., Lazaro-Gredilla, M., Schuurmans, D. & Ermon, S. (2018). Variational Rejection Sampling. Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 84:823-832. Available from https://proceedings.mlr.press/v84/grover18a.html.
