A Practical Algorithm for Topic Modeling with Provable Guarantees

Sanjeev Arora; Rong Ge; Yonatan Halpern; David Mimno; Ankur Moitra; David Sontag; Yichen Wu; Michael Zhu

A Practical Algorithm for Topic Modeling with Provable Guarantees

Sanjeev Arora, Rong Ge, Yonatan Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(2):280-288, 2013.

Abstract

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-arora13,
  title = 	 {A Practical Algorithm for Topic Modeling with Provable Guarantees},
  author = 	 {Arora, Sanjeev and Ge, Rong and Halpern, Yonatan and Mimno, David and Moitra, Ankur and Sontag, David and Wu, Yichen and Zhu, Michael},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {280--288},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/arora13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/arora13.html},
  abstract = 	 {Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.}
}

Endnote

%0 Conference Paper
%T A Practical Algorithm for Topic Modeling with Provable Guarantees
%A Sanjeev Arora
%A Rong Ge
%A Yonatan Halpern
%A David Mimno
%A Ankur Moitra
%A David Sontag
%A Yichen Wu
%A Michael Zhu
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester	
%F pmlr-v28-arora13
%I PMLR
%P 280--288
%U https://proceedings.mlr.press/v28/arora13.html
%V 28
%N 2
%X Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

RIS


TY  - CPAPER
TI  - A Practical Algorithm for Topic Modeling with Provable Guarantees
AU  - Sanjeev Arora
AU  - Rong Ge
AU  - Yonatan Halpern
AU  - David Mimno
AU  - Ankur Moitra
AU  - David Sontag
AU  - Yichen Wu
AU  - Michael Zhu
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/13
ED  - Sanjoy Dasgupta
ED  - David McAllester	
ID  - pmlr-v28-arora13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 2
SP  - 280
EP  - 288
L1  - http://proceedings.mlr.press/v28/arora13.pdf
UR  - https://proceedings.mlr.press/v28/arora13.html
AB  - Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.
ER  -

APA


Arora, S., Ge, R., Halpern, Y., Mimno, D., Moitra, A., Sontag, D., Wu, Y. & Zhu, M.. (2013). A Practical Algorithm for Topic Modeling with Provable Guarantees. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(2):280-288 Available from https://proceedings.mlr.press/v28/arora13.html.

A Practical Algorithm for Topic Modeling with Provable Guarantees

Abstract

Cite this Paper

Related Material