Amortized Variational Inference: When and Why?

Charles C. Margossian, David M. Blei
Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:2434-2449, 2024.

Abstract

In a probabilistic latent variable model, factorized (or mean-field) variational inference (F-VI) fits a separate parametric distribution for each latent variable. Amortized variational inference (A-VI) instead learns a common inference function, which maps each observation to its corresponding latent variable’s approximate posterior. Typically, A-VI is used as a cog in the training of variational autoencoders; however, it stands to reason that A-VI could also serve as a general alternative to F-VI. In this paper we study when and why A-VI can be used for approximate Bayesian inference. We derive necessary, sufficient, and verifiable conditions on a latent variable model under which A-VI can attain F-VI’s optimal solution, thereby closing the amortization gap. We prove these conditions are uniquely verified by simple hierarchical models, a broad class that encompasses many models in machine learning. We then show, on a broader class of models, how to expand the domain of A-VI’s inference function to improve its solution, and we provide examples, e.g., hidden Markov models, where the amortization gap cannot be closed.
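To make the F-VI/A-VI distinction concrete, below is a minimal sketch (not from the paper) contrasting the two approaches on a toy hierarchical Gaussian model, z_n ~ N(0, 1), x_n | z_n ~ N(z_n, 1), written in PyTorch. The model, network architecture, and all variable names are illustrative assumptions, not the authors' setup.

# Toy comparison of F-VI (per-datapoint variational parameters) and
# A-VI (a shared inference network) on z_n ~ N(0,1), x_n | z_n ~ N(z_n, 1).
import torch

torch.manual_seed(0)
N = 200
z_true = torch.randn(N)
x = z_true + torch.randn(N)  # observations

def elbo(x, mu, log_sigma):
    # Analytic ELBO for q(z_n) = N(mu_n, sigma_n^2), dropping constants:
    # sum_n E_q[log p(x_n | z_n)] - KL(q(z_n) || p(z_n)).
    sigma2 = torch.exp(2 * log_sigma)
    log_lik = -0.5 * ((x - mu) ** 2 + sigma2)        # E_q log N(x | z, 1)
    kl = 0.5 * (mu ** 2 + sigma2 - 1) - log_sigma    # KL(q || N(0, 1))
    return (log_lik - kl).sum()

# F-VI: one (mu_n, log_sigma_n) pair per data point.
mu_f = torch.zeros(N, requires_grad=True)
ls_f = torch.zeros(N, requires_grad=True)
opt = torch.optim.Adam([mu_f, ls_f], lr=0.05)
for _ in range(500):
    opt.zero_grad()
    loss = -elbo(x, mu_f, ls_f)
    loss.backward()
    opt.step()

# A-VI: a shared inference network maps each x_n to (mu_n, log_sigma_n).
net = torch.nn.Sequential(torch.nn.Linear(1, 16), torch.nn.Tanh(),
                          torch.nn.Linear(16, 2))
opt = torch.optim.Adam(net.parameters(), lr=0.05)
for _ in range(500):
    out = net(x.unsqueeze(-1))
    opt.zero_grad()
    loss = -elbo(x, out[:, 0], out[:, 1])
    loss.backward()
    opt.step()

# The amortization gap is the difference between the two optimal ELBOs.
with torch.no_grad():
    out = net(x.unsqueeze(-1))
    print("F-VI ELBO:", elbo(x, mu_f, ls_f).item())
    print("A-VI ELBO:", elbo(x, out[:, 0], out[:, 1]).item())

Because each latent z_n interacts with the data only through its own observation x_n, this toy example is a simple hierarchical model in the paper’s sense, so a sufficiently flexible inference network can in principle match F-VI’s optimum and close the amortization gap.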

Cite this Paper

BibTeX
@InProceedings{pmlr-v244-margossian24a,
  title = {Amortized Variational Inference: When and Why?},
  author = {Margossian, Charles C. and Blei, David M.},
  booktitle = {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages = {2434--2449},
  year = {2024},
  editor = {Kiyavash, Negar and Mooij, Joris M.},
  volume = {244},
  series = {Proceedings of Machine Learning Research},
  month = {15--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v244/main/assets/margossian24a/margossian24a.pdf},
  url = {https://proceedings.mlr.press/v244/margossian24a.html},
  abstract = {In a probabilistic latent variable model, factorized (or mean-field) variational inference (F-VI) fits a separate parametric distribution for each latent variable. Amortized variational inference (A-VI) instead learns a common inference function, which maps each observation to its corresponding latent variable’s approximate posterior. Typically, A-VI is used as a cog in the training of variational autoencoders; however, it stands to reason that A-VI could also serve as a general alternative to F-VI. In this paper we study when and why A-VI can be used for approximate Bayesian inference. We derive necessary, sufficient, and verifiable conditions on a latent variable model under which A-VI can attain F-VI’s optimal solution, thereby closing the amortization gap. We prove these conditions are uniquely verified by simple hierarchical models, a broad class that encompasses many models in machine learning. We then show, on a broader class of models, how to expand the domain of A-VI’s inference function to improve its solution, and we provide examples, e.g., hidden Markov models, where the amortization gap cannot be closed.}
}
Endnote
%0 Conference Paper
%T Amortized Variational Inference: When and Why?
%A Charles C. Margossian
%A David M. Blei
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-margossian24a
%I PMLR
%P 2434--2449
%U https://proceedings.mlr.press/v244/margossian24a.html
%V 244
%X In a probabilistic latent variable model, factorized (or mean-field) variational inference (F-VI) fits a separate parametric distribution for each latent variable. Amortized variational inference (A-VI) instead learns a common inference function, which maps each observation to its corresponding latent variable’s approximate posterior. Typically, A-VI is used as a cog in the training of variational autoencoders; however, it stands to reason that A-VI could also serve as a general alternative to F-VI. In this paper we study when and why A-VI can be used for approximate Bayesian inference. We derive necessary, sufficient, and verifiable conditions on a latent variable model under which A-VI can attain F-VI’s optimal solution, thereby closing the amortization gap. We prove these conditions are uniquely verified by simple hierarchical models, a broad class that encompasses many models in machine learning. We then show, on a broader class of models, how to expand the domain of A-VI’s inference function to improve its solution, and we provide examples, e.g., hidden Markov models, where the amortization gap cannot be closed.
APA
Margossian, C.C. & Blei, D.M. (2024). Amortized Variational Inference: When and Why? Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:2434-2449. Available from https://proceedings.mlr.press/v244/margossian24a.html.
