Nested Chinese Restaurant Franchise Process: Applications to User Tracking and Document Modeling

Amr Ahmed; Liangjie Hong; Alexander Smola

Proceedings of the 30th International Conference on Machine Learning, PMLR 28(3):1426-1434, 2013.

Abstract

Much natural data is hierarchical in nature. Moreover, this hierarchy is often shared between different instances. We introduce the nested Chinese Restaurant Franchise Process as a means to obtain both hierarchical tree-structured representations for objects, akin to (but more general than) the nested Chinese Restaurant Process, while sharing their structure akin to the Hierarchical Dirichlet Process. Moreover, by decoupling the structure-generating part of the process from the components responsible for the observations, we are able to apply the same statistical approach to a variety of user generated data. In particular, we model the joint distribution of microblogs and locations for Twitter users. This leads to a 40% reduction in location uncertainty relative to the best previously published results. Moreover, we model documents from the NIPS papers dataset, obtaining excellent perplexity relative to (hierarchical) Pachinko allocation and LDA.

Cite this Paper

BibTeX


@InProceedings{pmlr-v28-ahmed13,
  title = 	 {Nested Chinese Restaurant Franchise Process:  Applications to User Tracking and Document Modeling},
  author = 	 {Ahmed, Amr and Hong, Liangjie and Smola, Alexander},
  booktitle = 	 {Proceedings of the 30th International Conference on Machine Learning},
  pages = 	 {1426--1434},
  year = 	 {2013},
  editor = 	 {Dasgupta, Sanjoy and McAllester, David},
  volume = 	 {28},
  number =       {3},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Atlanta, Georgia, USA},
  month = 	 {17--19 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v28/ahmed13.pdf},
  url = 	 {https://proceedings.mlr.press/v28/ahmed13.html},
  abstract = 	 {Much natural data is hierarchical in nature. Moreover, this hierarchy is often shared between different instances. We introduce the nested Chinese Restaurant Franchise Process as a means to obtain both hierarchical tree-structured representations for objects, akin to (but more general than) the nested Chinese Restaurant Process, while sharing their structure akin to the Hierarchical Dirichlet Process. Moreover, by decoupling the \emph{structure} generating part of the process from the components responsible for the observations, we are able to apply the same statistical approach to a variety of user generated data. In particular, we model the joint distribution of microblogs and locations for Twitter users. This leads to a 40\% reduction in location uncertainty relative to the best previously published results. Moreover, we model documents from the NIPS papers dataset, obtaining excellent perplexity relative to (hierarchical) Pachinko allocation and LDA.}
}

Endnote

%0 Conference Paper
%T Nested Chinese Restaurant Franchise Process:  Applications to User Tracking and Document Modeling
%A Amr Ahmed
%A Liangjie Hong
%A Alexander Smola
%B Proceedings of the 30th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Sanjoy Dasgupta
%E David McAllester
%F pmlr-v28-ahmed13
%I PMLR
%P 1426--1434
%U https://proceedings.mlr.press/v28/ahmed13.html
%V 28
%N 3
%X Much natural data is hierarchical in nature. Moreover, this hierarchy is often shared between different instances. We introduce the nested Chinese Restaurant Franchise Process as a means to obtain both hierarchical tree-structured representations for objects, akin to (but more general than) the nested Chinese Restaurant Process, while sharing their structure akin to the Hierarchical Dirichlet Process. Moreover, by decoupling the structure-generating part of the process from the components responsible for the observations, we are able to apply the same statistical approach to a variety of user generated data. In particular, we model the joint distribution of microblogs and locations for Twitter users. This leads to a 40% reduction in location uncertainty relative to the best previously published results. Moreover, we model documents from the NIPS papers dataset, obtaining excellent perplexity relative to (hierarchical) Pachinko allocation and LDA.

RIS


TY  - CPAPER
TI  - Nested Chinese Restaurant Franchise Process:  Applications to User Tracking and Document Modeling
AU  - Amr Ahmed
AU  - Liangjie Hong
AU  - Alexander Smola
BT  - Proceedings of the 30th International Conference on Machine Learning
DA  - 2013/05/26
ED  - Sanjoy Dasgupta
ED  - David McAllester
ID  - pmlr-v28-ahmed13
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 28
IS  - 3
SP  - 1426
EP  - 1434
L1  - http://proceedings.mlr.press/v28/ahmed13.pdf
UR  - https://proceedings.mlr.press/v28/ahmed13.html
AB  - Much natural data is hierarchical in nature. Moreover, this hierarchy is often shared between different instances. We introduce the nested Chinese Restaurant Franchise Process as a means to obtain both hierarchical tree-structured representations for objects, akin to (but more general than) the nested Chinese Restaurant Process, while sharing their structure akin to the Hierarchical Dirichlet Process. Moreover, by decoupling the structure-generating part of the process from the components responsible for the observations, we are able to apply the same statistical approach to a variety of user generated data. In particular, we model the joint distribution of microblogs and locations for Twitter users. This leads to a 40% reduction in location uncertainty relative to the best previously published results. Moreover, we model documents from the NIPS papers dataset, obtaining excellent perplexity relative to (hierarchical) Pachinko allocation and LDA.
ER  -

APA


Ahmed, A., Hong, L. & Smola, A. (2013). Nested Chinese Restaurant Franchise Process: Applications to User Tracking and Document Modeling. Proceedings of the 30th International Conference on Machine Learning, in Proceedings of Machine Learning Research 28(3):1426-1434. Available from https://proceedings.mlr.press/v28/ahmed13.html.
