Reliable and Scalable Variational Inference for the Hierarchical Dirichlet Process

Michael Hughes, Dae Il Kim, Erik Sudderth
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:370-378, 2015.

Abstract

We introduce a new variational inference objective for hierarchical Dirichlet process admixture models. Our approach provides novel and scalable algorithms for learning nonparametric topic models of text documents and Gaussian admixture models of image patches. Improving on the point estimates of topic probabilities used in previous work, we define full variational posteriors for all latent variables and optimize parameters via a novel surrogate likelihood bound. We show that this approach has crucial advantages for data-driven learning of the number of topics. Via merge and delete moves that remove redundant or irrelevant topics, we learn compact and interpretable models with less computation. Scaling to millions of documents is possible using stochastic or memoized variational updates.

Cite this Paper


BibTeX
@InProceedings{pmlr-v38-hughes15,
  title     = {{Reliable and Scalable Variational Inference for the Hierarchical Dirichlet Process}},
  author    = {Michael Hughes and Dae Il Kim and Erik Sudderth},
  booktitle = {Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics},
  pages     = {370--378},
  year      = {2015},
  editor    = {Guy Lebanon and S. V. N. Vishwanathan},
  volume    = {38},
  series    = {Proceedings of Machine Learning Research},
  address   = {San Diego, California, USA},
  month     = {09--12 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v38/hughes15.pdf},
  url       = {http://proceedings.mlr.press/v38/hughes15.html},
  abstract  = {We introduce a new variational inference objective for hierarchical Dirichlet process admixture models. Our approach provides novel and scalable algorithms for learning nonparametric topic models of text documents and Gaussian admixture models of image patches. Improving on the point estimates of topic probabilities used in previous work, we define full variational posteriors for all latent variables and optimize parameters via a novel surrogate likelihood bound. We show that this approach has crucial advantages for data-driven learning of the number of topics. Via merge and delete moves that remove redundant or irrelevant topics, we learn compact and interpretable models with less computation. Scaling to millions of documents is possible using stochastic or memoized variational updates.}
}
Endnote
%0 Conference Paper
%T Reliable and Scalable Variational Inference for the Hierarchical Dirichlet Process
%A Michael Hughes
%A Dae Il Kim
%A Erik Sudderth
%B Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2015
%E Guy Lebanon
%E S. V. N. Vishwanathan
%F pmlr-v38-hughes15
%I PMLR
%P 370--378
%U http://proceedings.mlr.press/v38/hughes15.html
%V 38
%X We introduce a new variational inference objective for hierarchical Dirichlet process admixture models. Our approach provides novel and scalable algorithms for learning nonparametric topic models of text documents and Gaussian admixture models of image patches. Improving on the point estimates of topic probabilities used in previous work, we define full variational posteriors for all latent variables and optimize parameters via a novel surrogate likelihood bound. We show that this approach has crucial advantages for data-driven learning of the number of topics. Via merge and delete moves that remove redundant or irrelevant topics, we learn compact and interpretable models with less computation. Scaling to millions of documents is possible using stochastic or memoized variational updates.
RIS
TY  - CPAPER
TI  - Reliable and Scalable Variational Inference for the Hierarchical Dirichlet Process
AU  - Michael Hughes
AU  - Dae Il Kim
AU  - Erik Sudderth
BT  - Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics
DA  - 2015/02/21
ED  - Guy Lebanon
ED  - S. V. N. Vishwanathan
ID  - pmlr-v38-hughes15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 38
SP  - 370
EP  - 378
L1  - http://proceedings.mlr.press/v38/hughes15.pdf
UR  - http://proceedings.mlr.press/v38/hughes15.html
AB  - We introduce a new variational inference objective for hierarchical Dirichlet process admixture models. Our approach provides novel and scalable algorithms for learning nonparametric topic models of text documents and Gaussian admixture models of image patches. Improving on the point estimates of topic probabilities used in previous work, we define full variational posteriors for all latent variables and optimize parameters via a novel surrogate likelihood bound. We show that this approach has crucial advantages for data-driven learning of the number of topics. Via merge and delete moves that remove redundant or irrelevant topics, we learn compact and interpretable models with less computation. Scaling to millions of documents is possible using stochastic or memoized variational updates.
ER  -
APA
Hughes, M., Kim, D. I., & Sudderth, E. (2015). Reliable and Scalable Variational Inference for the Hierarchical Dirichlet Process. Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 38:370-378. Available from http://proceedings.mlr.press/v38/hughes15.html
