A Practical Algorithm for Topic Modeling with Provable Guarantees

Sanjeev Arora, Rong Ge, Yonatan Halpern, David Mimno, Ankur Moitra, David Sontag, Yichen Wu, Michael Zhu
; Proceedings of the 30th International Conference on Machine Learning, PMLR 28(2):280-288, 2013.

Abstract

Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.

Cite this Paper


BibTeX
@InProceedings{pmlr-v28-arora13, title = {A Practical Algorithm for Topic Modeling with Provable Guarantees}, author = {Sanjeev Arora and Rong Ge and Yonatan Halpern and David Mimno and Ankur Moitra and David Sontag and Yichen Wu and Michael Zhu}, booktitle = {Proceedings of the 30th International Conference on Machine Learning}, pages = {280--288}, year = {2013}, editor = {Sanjoy Dasgupta and David McAllester}, volume = {28}, number = {2}, series = {Proceedings of Machine Learning Research}, address = {Atlanta, Georgia, USA}, month = {17--19 Jun}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v28/arora13.pdf}, url = {http://proceedings.mlr.press/v28/arora13.html}, abstract = {Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.} }
Endnote
%0 Conference Paper %T A Practical Algorithm for Topic Modeling with Provable Guarantees %A Sanjeev Arora %A Rong Ge %A Yonatan Halpern %A David Mimno %A Ankur Moitra %A David Sontag %A Yichen Wu %A Michael Zhu %B Proceedings of the 30th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2013 %E Sanjoy Dasgupta %E David McAllester %F pmlr-v28-arora13 %I PMLR %J Proceedings of Machine Learning Research %P 280--288 %U http://proceedings.mlr.press %V 28 %N 2 %W PMLR %X Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster.
RIS
TY - CPAPER TI - A Practical Algorithm for Topic Modeling with Provable Guarantees AU - Sanjeev Arora AU - Rong Ge AU - Yonatan Halpern AU - David Mimno AU - Ankur Moitra AU - David Sontag AU - Yichen Wu AU - Michael Zhu BT - Proceedings of the 30th International Conference on Machine Learning PY - 2013/02/13 DA - 2013/02/13 ED - Sanjoy Dasgupta ED - David McAllester ID - pmlr-v28-arora13 PB - PMLR SP - 280 DP - PMLR EP - 288 L1 - http://proceedings.mlr.press/v28/arora13.pdf UR - http://proceedings.mlr.press/v28/arora13.html AB - Topic models provide a useful method for dimensionality reduction and exploratory data analysis in large text corpora. Most approaches to topic model learning have been based on a maximum likelihood objective. Efficient algorithms exist that attempt to approximate this objective, but they have no provable guarantees. Recently, algorithms have been introduced that provide provable bounds, but these algorithms are not practical because they are inefficient and not robust to violations of model assumptions. In this paper we present an algorithm for learning topic models that is both provable and practical. The algorithm produces results comparable to the best MCMC implementations while running orders of magnitude faster. ER -
APA
Arora, S., Ge, R., Halpern, Y., Mimno, D., Moitra, A., Sontag, D., Wu, Y. & Zhu, M.. (2013). A Practical Algorithm for Topic Modeling with Provable Guarantees. Proceedings of the 30th International Conference on Machine Learning, in PMLR 28(2):280-288

Related Material