Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts

Tien Vu Nguyen, Dinh Phung, Xuanlong Nguyen, Swetha Venkatesh, Hung Bui
Proceedings of the 31st International Conference on Machine Learning, PMLR 32(1):288-296, 2014.

Abstract

We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partitions groups into clusters. Using the Dirichlet process as the building block, our model constructs a product base-measure with a nested structure to accommodate content and context observations at multiple levels. The proposed model possesses properties that link the nested Dirichlet processes (nDP) and the Dirichlet process mixture models (DPM) in an interesting way: integrating out all contents results in the DPM over contexts, whereas integrating out group-specific contexts results in the nDP mixture over content variables. We provide a Polya-urn view of the model and an efficient collapsed Gibbs inference procedure. Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains.
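
The Polya-urn view mentioned in the abstract can be illustrated with the standard urn scheme for a single Dirichlet process. The sketch below is only that textbook building block, not the paper's nested model or its collapsed Gibbs sampler. The function name `polya_urn_assignments` and the parameters `alpha` and `seed` are illustrative assumptions that do not come from the paper.

```python
import random


def polya_urn_assignments(n, alpha, seed=0):
    """Draw cluster labels for n items from a single-level Polya urn.

    Item i joins existing cluster k with probability proportional to the
    number of items already in k, or opens a new cluster with probability
    proportional to alpha (the concentration parameter, assumed > 0).
    """
    rng = random.Random(seed)
    labels = []   # labels[i] = cluster index of item i
    counts = []   # counts[k] = number of items currently in cluster k
    for _ in range(n):
        # Unnormalised weights: one per existing cluster, plus alpha for a new one.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new cluster
        else:
            counts[k] += 1     # join an existing cluster
        labels.append(k)
    return labels


if __name__ == "__main__":
    # Larger alpha tends to produce more clusters for the same n.
    for alpha in (0.5, 5.0):
        labels = polya_urn_assignments(200, alpha, seed=1)
        print(alpha, "->", len(set(labels)), "clusters")
```

In this scheme the number of clusters is not fixed in advance and grows with the data, which is the property the abstract relies on when it uses the Dirichlet process as a building block.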

Cite this Paper


BibTeX
@InProceedings{pmlr-v32-nguyenb14,
  title = {Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts},
  author = {Nguyen, Tien Vu and Phung, Dinh and Nguyen, Xuanlong and Venkatesh, Swetha and Bui, Hung},
  booktitle = {Proceedings of the 31st International Conference on Machine Learning},
  pages = {288--296},
  year = {2014},
  editor = {Xing, Eric P. and Jebara, Tony},
  volume = {32},
  number = {1},
  series = {Proceedings of Machine Learning Research},
  address = {Beijing, China},
  month = {22--24 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v32/nguyenb14.pdf},
  url = {https://proceedings.mlr.press/v32/nguyenb14.html},
  abstract = {We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partitions groups into clusters. Using the Dirichlet process as the building block, our model constructs a product base-measure with a nested structure to accommodate content and context observations at multiple levels. The proposed model possesses properties that link the nested Dirichlet processes (nDP) and the Dirichlet process mixture models (DPM) in an interesting way: integrating out all contents results in the DPM over contexts, whereas integrating out group-specific contexts results in the nDP mixture over content variables. We provide a Polya-urn view of the model and an efficient collapsed Gibbs inference procedure. Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains.}
}
Endnote
%0 Conference Paper
%T Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts
%A Tien Vu Nguyen
%A Dinh Phung
%A Xuanlong Nguyen
%A Swetha Venkatesh
%A Hung Bui
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-nguyenb14
%I PMLR
%P 288--296
%U https://proceedings.mlr.press/v32/nguyenb14.html
%V 32
%N 1
%X We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partitions groups into clusters. Using the Dirichlet process as the building block, our model constructs a product base-measure with a nested structure to accommodate content and context observations at multiple levels. The proposed model possesses properties that link the nested Dirichlet processes (nDP) and the Dirichlet process mixture models (DPM) in an interesting way: integrating out all contents results in the DPM over contexts, whereas integrating out group-specific contexts results in the nDP mixture over content variables. We provide a Polya-urn view of the model and an efficient collapsed Gibbs inference procedure. Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains.
RIS
TY - CPAPER
TI - Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts
AU - Tien Vu Nguyen
AU - Dinh Phung
AU - Xuanlong Nguyen
AU - Swetha Venkatesh
AU - Hung Bui
BT - Proceedings of the 31st International Conference on Machine Learning
DA - 2014/01/27
ED - Eric P. Xing
ED - Tony Jebara
ID - pmlr-v32-nguyenb14
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 32
IS - 1
SP - 288
EP - 296
L1 - http://proceedings.mlr.press/v32/nguyenb14.pdf
UR - https://proceedings.mlr.press/v32/nguyenb14.html
AB - We present a Bayesian nonparametric framework for multilevel clustering which utilizes group-level context information to simultaneously discover low-dimensional structures of the group contents and partitions groups into clusters. Using the Dirichlet process as the building block, our model constructs a product base-measure with a nested structure to accommodate content and context observations at multiple levels. The proposed model possesses properties that link the nested Dirichlet processes (nDP) and the Dirichlet process mixture models (DPM) in an interesting way: integrating out all contents results in the DPM over contexts, whereas integrating out group-specific contexts results in the nDP mixture over content variables. We provide a Polya-urn view of the model and an efficient collapsed Gibbs inference procedure. Extensive experiments on real-world datasets demonstrate the advantage of utilizing context information via our model in both text and image domains.
ER -
APA
Nguyen, T.V., Phung, D., Nguyen, X., Venkatesh, S. & Bui, H. (2014). Bayesian Nonparametric Multilevel Clustering with Group-Level Contexts. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(1):288-296. Available from https://proceedings.mlr.press/v32/nguyenb14.html.
