Online Inference for the Infinite Topic-Cluster Model: Storylines from Streaming Text

Amr Ahmed, Qirong Ho, Choon Hui Teo, Jacob Eisenstein, Alex Smola, Eric Xing
Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, PMLR 15:101-109, 2011.

Abstract

We present the time-dependent topic-cluster model, a hierarchical approach for combining Latent Dirichlet Allocation and clustering via the Recurrent Chinese Restaurant Process. It inherits the advantages of both of its constituents, namely interpretability and concise representation. We show how it can be applied to streaming collections of objects such as real-world feeds in a news portal. We provide details of a parallel Sequential Monte Carlo algorithm to perform inference in the resulting graphical model which scales to hundreds of thousands of documents.

Cite this Paper


BibTeX
@InProceedings{pmlr-v15-ahmed11a,
  title = {Online Inference for the Infinite Topic-Cluster Model: Storylines from Streaming Text},
  author = {Ahmed, Amr and Ho, Qirong and Teo, Choon Hui and Eisenstein, Jacob and Smola, Alex and Xing, Eric},
  booktitle = {Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics},
  pages = {101--109},
  year = {2011},
  editor = {Gordon, Geoffrey and Dunson, David and Dudík, Miroslav},
  volume = {15},
  series = {Proceedings of Machine Learning Research},
  address = {Fort Lauderdale, FL, USA},
  month = {11--13 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v15/ahmed11a/ahmed11a.pdf},
  url = {https://proceedings.mlr.press/v15/ahmed11a.html},
  abstract = {We present the time-dependent topic-cluster model, a hierarchical approach for combining Latent Dirichlet Allocation and clustering via the Recurrent Chinese Restaurant Process. It inherits the advantages of both of its constituents, namely interpretability and concise representation. We show how it can be applied to streaming collections of objects such as real-world feeds in a news portal. We provide details of a parallel Sequential Monte Carlo algorithm to perform inference in the resulting graphical model which scales to hundreds of thousands of documents.}
}
Endnote
%0 Conference Paper
%T Online Inference for the Infinite Topic-Cluster Model: Storylines from Streaming Text
%A Amr Ahmed
%A Qirong Ho
%A Choon Hui Teo
%A Jacob Eisenstein
%A Alex Smola
%A Eric Xing
%B Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2011
%E Geoffrey Gordon
%E David Dunson
%E Miroslav Dudík
%F pmlr-v15-ahmed11a
%I PMLR
%P 101--109
%U https://proceedings.mlr.press/v15/ahmed11a.html
%V 15
%X We present the time-dependent topic-cluster model, a hierarchical approach for combining Latent Dirichlet Allocation and clustering via the Recurrent Chinese Restaurant Process. It inherits the advantages of both of its constituents, namely interpretability and concise representation. We show how it can be applied to streaming collections of objects such as real-world feeds in a news portal. We provide details of a parallel Sequential Monte Carlo algorithm to perform inference in the resulting graphical model which scales to hundreds of thousands of documents.
RIS
TY  - CPAPER
TI  - Online Inference for the Infinite Topic-Cluster Model: Storylines from Streaming Text
AU  - Amr Ahmed
AU  - Qirong Ho
AU  - Choon Hui Teo
AU  - Jacob Eisenstein
AU  - Alex Smola
AU  - Eric Xing
BT  - Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics
DA  - 2011/06/14
ED  - Geoffrey Gordon
ED  - David Dunson
ED  - Miroslav Dudík
ID  - pmlr-v15-ahmed11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 15
SP  - 101
EP  - 109
L1  - http://proceedings.mlr.press/v15/ahmed11a/ahmed11a.pdf
UR  - https://proceedings.mlr.press/v15/ahmed11a.html
AB  - We present the time-dependent topic-cluster model, a hierarchical approach for combining Latent Dirichlet Allocation and clustering via the Recurrent Chinese Restaurant Process. It inherits the advantages of both of its constituents, namely interpretability and concise representation. We show how it can be applied to streaming collections of objects such as real-world feeds in a news portal. We provide details of a parallel Sequential Monte Carlo algorithm to perform inference in the resulting graphical model which scales to hundreds of thousands of documents.
ER  -
APA
Ahmed, A., Ho, Q., Teo, C.H., Eisenstein, J., Smola, A. & Xing, E. (2011). Online Inference for the Infinite Topic-Cluster Model: Storylines from Streaming Text. Proceedings of the Fourteenth International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 15:101-109. Available from https://proceedings.mlr.press/v15/ahmed11a.html.
