Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation

Han Xiao; Thomas Stibor

Proceedings of 2nd Asian Conference on Machine Learning, PMLR 13:63-78, 2010.

Abstract

Collapsed Gibbs sampling is a frequently applied method to approximate intractable integrals in probabilistic generative models such as latent Dirichlet allocation. However, this sampling method has the crucial drawback of high computational complexity, which limits its applicability to large data sets. We propose a novel dynamic sampling strategy to significantly improve the efficiency of collapsed Gibbs sampling. The strategy is explored in terms of efficiency, convergence and perplexity. In addition, we present a straightforward parallelization to further improve the efficiency. Finally, we underpin our proposed improvements with a comparative study on data sets of different scales.
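
For orientation, the following is a minimal sketch of the standard (baseline) collapsed Gibbs sampler for latent Dirichlet allocation. It is not the dynamic sampling strategy or the parallelization proposed in the paper, neither of which is described on this page. The function name `collapsed_gibbs_lda`, the symmetric hyperparameters `alpha` and `beta`, and the toy corpus are assumptions made for illustration only. The sketch shows why the per-token cost grows with the number of topics: every update evaluates a weight for each topic.

```python
import random


def collapsed_gibbs_lda(docs, num_topics, vocab_size,
                        alpha=0.1, beta=0.01, iterations=50, seed=0):
    """Baseline collapsed Gibbs sampler for LDA (illustrative sketch only).

    docs: list of documents, each a list of word ids in [0, vocab_size).
    Returns the count tables (doc_topic, topic_word) and the assignments.
    """
    rng = random.Random(seed)
    doc_topic = [[0] * num_topics for _ in docs]                # n_{d,k}
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # n_{k,w}
    topic_total = [0] * num_topics                              # n_k
    assignments = []

    # Initialise every token with a uniformly random topic.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k_old = assignments[d][i]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1

                # Unnormalised conditional over all topics: O(num_topics).
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(num_topics)
                ]
                k_new = rng.choices(range(num_topics), weights=weights)[0]

                # Record the new assignment.
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1

    return doc_topic, topic_word, assignments


# Toy corpus (hypothetical): three short documents over a 5-word vocabulary.
docs = [[0, 1, 2, 0], [3, 4, 3, 4], [0, 2, 1, 1]]
doc_topic, topic_word, z = collapsed_gibbs_lda(docs, num_topics=2, vocab_size=5)
```

The count tables are kept consistent with the assignments at every step, so each row of `doc_topic` always sums to the length of its document.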

Cite this Paper

BibTeX


@InProceedings{pmlr-v13-xiao10a,
  title = {Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation},
  author = {Xiao, Han and Stibor, Thomas},
  booktitle = {Proceedings of 2nd Asian Conference on Machine Learning},
  pages = {63--78},
  year = {2010},
  editor = {Sugiyama, Masashi and Yang, Qiang},
  volume = {13},
  series = {Proceedings of Machine Learning Research},
  address = {Tokyo, Japan},
  month = {08--10 Nov},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v13/xiao10a/xiao10a.pdf},
  url = {https://proceedings.mlr.press/v13/xiao10a.html},
  abstract = 	 {Collapsed Gibbs sampling is a frequently applied method to approximate intractable integrals in probabilistic generative models such as latent Dirichlet allocation. This sampling method has however the crucial drawback of high computational complexity, which makes it limited applicable on large data sets. We propose a novel dynamic sampling strategy to significantly improve the efficiency of collapsed Gibbs sampling. The strategy is explored in terms of efficiency, convergence and perplexity. Besides, we present a straight-forward parallelization to further improve the efficiency. Finally, we underpin our proposed improvements with a comparative study on different scale data sets.}
}

Endnote

%0 Conference Paper
%T Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation
%A Han Xiao
%A Thomas Stibor
%B Proceedings of 2nd Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2010
%E Masashi Sugiyama
%E Qiang Yang
%F pmlr-v13-xiao10a
%I PMLR
%P 63--78
%U https://proceedings.mlr.press/v13/xiao10a.html
%V 13
%X Collapsed Gibbs sampling is a frequently applied method to approximate intractable integrals in probabilistic generative models such as latent Dirichlet allocation. This sampling method has however the crucial drawback of high computational complexity, which makes it limited applicable on large data sets. We propose a novel dynamic sampling strategy to significantly improve the efficiency of collapsed Gibbs sampling. The strategy is explored in terms of efficiency, convergence and perplexity. Besides, we present a straight-forward parallelization to further improve the efficiency. Finally, we underpin our proposed improvements with a comparative study on different scale data sets.

RIS


TY  - CPAPER
TI  - Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation
AU  - Han Xiao
AU  - Thomas Stibor
BT  - Proceedings of 2nd Asian Conference on Machine Learning
DA  - 2010/10/31
ED  - Masashi Sugiyama
ED  - Qiang Yang
ID  - pmlr-v13-xiao10a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 13
SP  - 63
EP  - 78
L1  - http://proceedings.mlr.press/v13/xiao10a/xiao10a.pdf
UR  - https://proceedings.mlr.press/v13/xiao10a.html
AB  - Collapsed Gibbs sampling is a frequently applied method to approximate intractable integrals in probabilistic generative models such as latent Dirichlet allocation. This sampling method has however the crucial drawback of high computational complexity, which makes it limited applicable on large data sets. We propose a novel dynamic sampling strategy to significantly improve the efficiency of collapsed Gibbs sampling. The strategy is explored in terms of efficiency, convergence and perplexity. Besides, we present a straight-forward parallelization to further improve the efficiency. Finally, we underpin our proposed improvements with a comparative study on different scale data sets.
ER  -

APA


Xiao, H. & Stibor, T. (2010). Efficient Collapsed Gibbs Sampling for Latent Dirichlet Allocation. Proceedings of 2nd Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 13:63-78. Available from https://proceedings.mlr.press/v13/xiao10a.html.
