Distributed, partially collapsed MCMC for Bayesian Nonparametrics

Kumar Avinava Dubey, Michael Zhang, Eric Xing, Sinead Williamson
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3685-3695, 2020.

Abstract

Bayesian nonparametric (BNP) models provide elegant methods for discovering underlying latent features within a data set, but inference in such models can be slow. We exploit the fact that completely random measures, which commonly-used models like the Dirichlet process and the beta-Bernoulli process can be expressed using, are decomposable into independent sub-measures. We use this decomposition to partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components. We then select different inference algorithms for the two components: uncollapsed samplers mix well on the finite measure, while collapsed samplers mix well on the infinite, sparsely occupied tail. The resulting hybrid algorithm can be applied to a wide class of models, and can be easily distributed to allow scalable inference without sacrificing asymptotic convergence guarantees.
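
For the Dirichlet process case, the partition into an instantiated finite measure and an infinite tail can be written out explicitly. The block below is a sketch based on the standard posterior representation of the Dirichlet process, not a transcription of the paper's own notation; the symbols \alpha, H, K, n_k, \theta_k, B, \pi_k, and D' are assumptions introduced here for illustration.

```latex
% Assumed setup: D ~ DP(\alpha, H), and n observations occupy K instantiated
% components with parameters \theta_1, ..., \theta_K and counts n_1, ..., n_K.
% B, (\pi_1, ..., \pi_K), and D' are mutually independent.
\begin{align*}
D \mid \theta_{1:K}, n_{1:K}
  &\overset{d}{=} B \sum_{k=1}^{K} \pi_k \, \delta_{\theta_k} + (1 - B)\, D', \\
B &\sim \mathrm{Beta}(n, \alpha), \qquad n = \sum_{k=1}^{K} n_k, \\
(\pi_1, \dots, \pi_K) &\sim \mathrm{Dirichlet}(n_1, \dots, n_K), \\
D' &\sim \mathrm{DP}(\alpha, H).
\end{align*}
```

Under these assumptions, the first term is a finite measure on the K occupied atoms, whose weights can be sampled explicitly (uncollapsed), while the tail D' has no observations assigned to it and can be integrated out (collapsed).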

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-dubey20a,
  title = {Distributed, partially collapsed MCMC for Bayesian Nonparametrics},
  author = {Dubey, Kumar Avinava and Zhang, Michael and Xing, Eric and Williamson, Sinead},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = {3685--3695},
  year = {2020},
  editor = {Silvia Chiappa and Roberto Calandra},
  volume = {108},
  series = {Proceedings of Machine Learning Research},
  month = {26--28 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v108/dubey20a/dubey20a.pdf},
  url = {http://proceedings.mlr.press/v108/dubey20a.html},
  abstract = {Bayesian nonparametric (BNP) models provide elegant methods for discovering underlying latent features within a data set, but inference in such models can be slow. We exploit the fact that completely random measures, which commonly-used models like the Dirichlet process and the beta-Bernoulli process can be expressed using, are decomposable into independent sub-measures. We use this decomposition to partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components. We then select different inference algorithms for the two components: uncollapsed samplers mix well on the finite measure, while collapsed samplers mix well on the infinite, sparsely occupied tail. The resulting hybrid algorithm can be applied to a wide class of models, and can be easily distributed to allow scalable inference without sacrificing asymptotic convergence guarantees.}
}
Endnote
%0 Conference Paper
%T Distributed, partially collapsed MCMC for Bayesian Nonparametrics
%A Kumar Avinava Dubey
%A Michael Zhang
%A Eric Xing
%A Sinead Williamson
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-dubey20a
%I PMLR
%P 3685--3695
%U http://proceedings.mlr.press/v108/dubey20a.html
%V 108
%X Bayesian nonparametric (BNP) models provide elegant methods for discovering underlying latent features within a data set, but inference in such models can be slow. We exploit the fact that completely random measures, which commonly-used models like the Dirichlet process and the beta-Bernoulli process can be expressed using, are decomposable into independent sub-measures. We use this decomposition to partition the latent measure into a finite measure containing only instantiated components, and an infinite measure containing all other components. We then select different inference algorithms for the two components: uncollapsed samplers mix well on the finite measure, while collapsed samplers mix well on the infinite, sparsely occupied tail. The resulting hybrid algorithm can be applied to a wide class of models, and can be easily distributed to allow scalable inference without sacrificing asymptotic convergence guarantees.
APA
Dubey, K.A., Zhang, M., Xing, E. & Williamson, S. (2020). Distributed, partially collapsed MCMC for Bayesian Nonparametrics. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:3685-3695. Available from http://proceedings.mlr.press/v108/dubey20a.html
