Marginalized Stochastic Natural Gradients for Black-Box Variational Inference

Geng Ji, Debora Sujono, Erik B Sudderth
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:4870-4881, 2021.

Abstract

Black-box variational inference algorithms use stochastic sampling to analyze diverse statistical models, like those expressed in probabilistic programming languages, without model-specific derivations. While the popular score-function estimator computes unbiased gradient estimates, its variance is often unacceptably large, especially in models with discrete latent variables. We propose a stochastic natural gradient estimator that is as broadly applicable and unbiased, but improves efficiency by exploiting the curvature of the variational bound, and provably reduces variance by marginalizing discrete latent variables. Our marginalized stochastic natural gradients have intriguing connections to classic coordinate ascent variational inference, but allow parallel updates of variational parameters, and provide superior convergence guarantees relative to naive Monte Carlo approximations. We integrate our method with the probabilistic programming language Pyro and evaluate it on real-world models of documents, images, networks, and crowd-sourcing. Compared to score-function estimators, we require far fewer Monte Carlo samples and consistently converge orders of magnitude faster.
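The abstract contrasts the high-variance score-function estimator with an estimator that marginalizes discrete latent variables. The following is a minimal sketch of that variance-reduction idea only, not the paper's marginalized stochastic natural gradient algorithm: it compares the standard score-function (REINFORCE) gradient of the ELBO with an exact gradient obtained by summing over a single Bernoulli latent variable in a toy model. All names and values (mu, x_obs, q, etc.) are illustrative assumptions.

```python
# Toy model:  z ~ Bernoulli(0.5),  x | z ~ Normal(mu[z], 1),  q(z = 1) = q.
import numpy as np

rng = np.random.default_rng(0)
mu = np.array([-2.0, 2.0])   # likelihood means for z = 0 and z = 1
x_obs = 1.5                  # a single observed data point
prior = 0.5                  # p(z = 1)

def log_joint(z):
    """log p(x_obs, z) for z in {0, 1}."""
    log_prior = np.log(prior if z == 1 else 1.0 - prior)
    log_lik = -0.5 * (x_obs - mu[z]) ** 2 - 0.5 * np.log(2.0 * np.pi)
    return log_prior + log_lik

def score_function_grad(q, num_samples):
    """Monte Carlo score-function estimate of d ELBO / d q."""
    z = rng.random(num_samples) < q                      # samples from q(z)
    log_q = np.where(z, np.log(q), np.log(1.0 - q))
    f = np.array([log_joint(int(zi)) for zi in z]) - log_q
    # d/dq log q(z): 1/q if z = 1, -1/(1 - q) if z = 0
    score = np.where(z, 1.0 / q, -1.0 / (1.0 - q))
    return np.mean(f * score)

def marginalized_grad(q):
    """Exact d ELBO / d q obtained by summing over the discrete latent.
    ELBO(q) = (1 - q)(log p(x,0) - log(1 - q)) + q (log p(x,1) - log q)."""
    return (log_joint(1) - np.log(q)) - (log_joint(0) - np.log(1.0 - q))

q = 0.3
exact = marginalized_grad(q)
mc = [score_function_grad(q, 10) for _ in range(1000)]
print(f"exact gradient:              {exact:.4f}")
print(f"score-function (10 samples): mean {np.mean(mc):.4f}, std {np.std(mc):.4f}")
```

The score-function estimate is unbiased (its mean matches the exact gradient), but its per-run spread is large; summing out the discrete variable removes that sampling noise entirely, which is the intuition the abstract appeals to.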

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-ji21b, title = {Marginalized Stochastic Natural Gradients for Black-Box Variational Inference}, author = {Ji, Geng and Sujono, Debora and Sudderth, Erik B}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {4870--4881}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/ji21b/ji21b.pdf}, url = {https://proceedings.mlr.press/v139/ji21b.html}, abstract = {Black-box variational inference algorithms use stochastic sampling to analyze diverse statistical models, like those expressed in probabilistic programming languages, without model-specific derivations. While the popular score-function estimator computes unbiased gradient estimates, its variance is often unacceptably large, especially in models with discrete latent variables. We propose a stochastic natural gradient estimator that is as broadly applicable and unbiased, but improves efficiency by exploiting the curvature of the variational bound, and provably reduces variance by marginalizing discrete latent variables. Our marginalized stochastic natural gradients have intriguing connections to classic coordinate ascent variational inference, but allow parallel updates of variational parameters, and provide superior convergence guarantees relative to naive Monte Carlo approximations. We integrate our method with the probabilistic programming language Pyro and evaluate it on real-world models of documents, images, networks, and crowd-sourcing. Compared to score-function estimators, we require far fewer Monte Carlo samples and consistently converge orders of magnitude faster.} }
Endnote
%0 Conference Paper %T Marginalized Stochastic Natural Gradients for Black-Box Variational Inference %A Geng Ji %A Debora Sujono %A Erik B Sudderth %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-ji21b %I PMLR %P 4870--4881 %U https://proceedings.mlr.press/v139/ji21b.html %V 139 %X Black-box variational inference algorithms use stochastic sampling to analyze diverse statistical models, like those expressed in probabilistic programming languages, without model-specific derivations. While the popular score-function estimator computes unbiased gradient estimates, its variance is often unacceptably large, especially in models with discrete latent variables. We propose a stochastic natural gradient estimator that is as broadly applicable and unbiased, but improves efficiency by exploiting the curvature of the variational bound, and provably reduces variance by marginalizing discrete latent variables. Our marginalized stochastic natural gradients have intriguing connections to classic coordinate ascent variational inference, but allow parallel updates of variational parameters, and provide superior convergence guarantees relative to naive Monte Carlo approximations. We integrate our method with the probabilistic programming language Pyro and evaluate it on real-world models of documents, images, networks, and crowd-sourcing. Compared to score-function estimators, we require far fewer Monte Carlo samples and consistently converge orders of magnitude faster.
APA
Ji, G., Sujono, D. & Sudderth, E.B. (2021). Marginalized Stochastic Natural Gradients for Black-Box Variational Inference. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:4870-4881. Available from https://proceedings.mlr.press/v139/ji21b.html.