Distributed, partially collapsed MCMC for Bayesian Nonparametrics
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3685-3695, 2020.
Abstract
Bayesian nonparametric (BNP) models provide elegant methods for discovering underlying latent features within a data set, but inference in such models can be slow. We exploit the fact that completely random measures, in terms of which commonly used models such as the Dirichlet process and the beta-Bernoulli process can be expressed, are decomposable into independent sub-measures. We use this decomposition to partition the latent measure into a finite measure containing only instantiated components and an infinite measure containing all other components. We then select different inference algorithms for the two measures: uncollapsed samplers mix well on the finite measure, while collapsed samplers mix well on the infinite, sparsely occupied tail. The resulting hybrid algorithm can be applied to a wide class of models, and can be easily distributed to allow scalable inference without sacrificing asymptotic convergence guarantees.
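To make the finite/infinite split concrete, the following is a minimal single-machine sketch, not the paper's algorithm and not distributed: one hybrid Gibbs sweep for a Dirichlet process mixture of 1-D Gaussians, in which instantiated components are updated uncollapsed (with explicit means) while the unoccupied tail is handled in collapsed form through the prior predictive density. All names and hyperparameters (hybrid_gibbs_sweep, alpha, sigma, prior_mu, prior_sigma) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

def hybrid_gibbs_sweep(x, z, mus, alpha=1.0, sigma=1.0,
                       prior_mu=0.0, prior_sigma=3.0):
    """One sweep of a hybrid sampler for a DP mixture of 1-D Gaussians.

    Instantiated components are kept uncollapsed (explicit means `mus`);
    the infinite tail is collapsed into a single "new component" move
    scored by the prior predictive. Illustrative sketch only.
    """
    for i in range(len(x)):
        counts = np.bincount(np.delete(z, i), minlength=len(mus)).astype(float)
        with np.errstate(divide="ignore"):  # empty components score -inf
            log_p = (np.log(counts)
                     - 0.5 * np.log(sigma**2)
                     - 0.5 * ((x[i] - mus) / sigma) ** 2)
        pred_var = sigma**2 + prior_sigma**2  # collapsed tail: prior predictive
        log_new = (np.log(alpha)
                   - 0.5 * np.log(pred_var)
                   - 0.5 * (x[i] - prior_mu) ** 2 / pred_var)
        logits = np.append(log_p, log_new)
        probs = np.exp(logits - logits.max())
        probs /= probs.sum()
        k = rng.choice(len(probs), p=probs)
        if k == len(mus):  # open a new component from its conjugate posterior
            post_var = 1.0 / (1.0 / prior_sigma**2 + 1.0 / sigma**2)
            post_mean = post_var * (prior_mu / prior_sigma**2 + x[i] / sigma**2)
            mus = np.append(mus, rng.normal(post_mean, np.sqrt(post_var)))
        z[i] = k
    # Uncollapsed step: resample each instantiated mean given its members
    # (empty components are not pruned, for brevity).
    for k in range(len(mus)):
        members = x[z == k]
        post_var = 1.0 / (1.0 / prior_sigma**2 + len(members) / sigma**2)
        post_mean = post_var * (prior_mu / prior_sigma**2
                                + members.sum() / sigma**2)
        mus[k] = rng.normal(post_mean, np.sqrt(post_var))
    return z, mus

# Toy usage: two well-separated Gaussian clusters.
x = np.concatenate([rng.normal(-4.0, 1.0, 50), rng.normal(4.0, 1.0, 50)])
z = np.zeros(len(x), dtype=int)
mus = np.array([0.0])
for _ in range(25):
    z, mus = hybrid_gibbs_sweep(x, z, mus)
print("instantiated components:", len(mus))
```

In this sketch the finite measure corresponds to the explicit `mus` array (well suited to uncollapsed, parallelizable updates), while the infinite remainder never needs to be represented: it enters only through the collapsed "new component" probability, which is where collapsed samplers mix well.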