Streaming Variational Inference for Bayesian Nonparametric Mixture Models
Proceedings of the Eighteenth International Conference on Artificial Intelligence and Statistics, PMLR 38:968-976, 2015.
In theory, Bayesian nonparametric (BNP) models are well suited to streaming data scenarios due to their ability to adapt model complexity based on the amount of data observed. Unfortunately, such benefits have not been fully realized in practice; existing inference algorithms either are not applicable to streaming applications or are not extensible to nonparametric models. For the special case of Dirichlet processes, streaming inference has been considered. However, there is growing interest in more flexible BNP models, in particular building on the class of normalized random measures (NRMs). We work within this general framework and present a streaming variational inference algorithm for NRM mixture models based on assumed density filtering. Extensions to expectation propagation algorithms are possible in the batch data setting. We demonstrate the efficacy of the algorithm on clustering documents in large, streaming text corpora.