Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation


Han Xiao (Technical University Munich), Thomas Stibor (Technical University Munich);
Proceedings of 2nd Asian Conference on Machine Learning, PMLR 13:63-78, 2010.

Abstract

Collapsed Gibbs sampling is a frequently applied method for approximating intractable integrals in probabilistic generative models such as latent Dirichlet allocation. However, this sampling method has the crucial drawback of high computational complexity, which limits its applicability to large data sets. We propose a novel dynamic sampling strategy that significantly improves the efficiency of collapsed Gibbs sampling. The strategy is examined in terms of efficiency, convergence, and perplexity. In addition, we present a straightforward parallelization to further improve efficiency. Finally, we support the proposed improvements with a comparative study on data sets of different scales.
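
For orientation, the following is a minimal sketch of the standard (baseline) collapsed Gibbs sampler for latent Dirichlet allocation, not the dynamic sampling strategy proposed in the paper. It is meant only to illustrate where the computational cost comes from: every token requires evaluating the full conditional over all topics in each sweep. The function name `collapsed_gibbs_lda`, its parameters, and the symmetric priors `alpha` and `beta` are assumptions chosen for illustration.

```python
import random


def collapsed_gibbs_lda(docs, num_topics, vocab_size, alpha=0.1, beta=0.01,
                        num_iters=50, seed=0):
    """Baseline collapsed Gibbs sampler for LDA (illustrative sketch only).

    docs is a list of documents, each a list of integer word ids in
    the range [0, vocab_size). alpha and beta are assumed symmetric priors.
    """
    rng = random.Random(seed)
    # Count tables: document-topic, topic-word, and per-topic totals.
    n_dk = [[0] * num_topics for _ in docs]
    n_kw = [[0] * vocab_size for _ in range(num_topics)]
    n_k = [0] * num_topics

    # Assign a random initial topic to every token and fill the counts.
    z = []
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            n_dk[d][k] += 1
            n_kw[k][w] += 1
            n_k[k] += 1
        z.append(z_d)

    for _ in range(num_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token's assignment from the counts.
                k_old = z[d][i]
                n_dk[d][k_old] -= 1
                n_kw[k_old][w] -= 1
                n_k[k_old] -= 1
                # Unnormalized full conditional over all topics.
                # This loop costs O(num_topics) for every token in every sweep.
                weights = [
                    (n_dk[d][k] + alpha) * (n_kw[k][w] + beta)
                    / (n_k[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]
                # Record the new assignment and restore the counts.
                z[d][i] = k_new
                n_dk[d][k_new] += 1
                n_kw[k_new][w] += 1
                n_k[k_new] += 1
    return z, n_dk, n_kw
```

A call such as `collapsed_gibbs_lda([[0, 1, 0, 1], [2, 3, 2, 3, 3]], num_topics=2, vocab_size=4)` returns the per-token topic assignments together with the document-topic and topic-word count tables. The total work per sweep grows with the number of tokens times the number of topics, which is the cost the abstract refers to.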
