Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data

Xiuyan Ni, Novi Quadrianto, Yusu Wang, Chao Chen
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2622-2631, 2017.

Abstract

Clustering data with both continuous and discrete attributes is a challenging task. Existing methods lack a principled probabilistic formulation. In this paper, we propose a clustering method based on a tree-structured graphical model to describe the generation process of mixed-type data. Our tree-structured model factorizes into a product of pairwise interactions, and thus localizes the interaction between feature variables of different types. To provide a robust clustering method based on the tree model, we adopt a topographical view and compute peaks of the density function and their attractive basins for clustering. Furthermore, we leverage the theory of topological data analysis to adaptively merge trivial peaks into large ones in order to achieve meaningful clusterings. Our method outperforms state-of-the-art methods on mixed-type data.
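
The peak-and-basin idea in the abstract can be illustrated with a generic persistence-guided mode-seeking sketch. This is not the authors' implementation: the function name `cluster_by_persistence`, its parameters `density`, `neighbors`, and `tau`, and the toy one-dimensional density are all assumptions made for illustration, and the density here is a plain list of numbers rather than the paper's tree-structured model. Vertices are visited from highest to lowest density; a vertex with no already-visited neighbor starts a new basin, and two basins that meet are merged when the lower peak rises less than `tau` above the meeting point.

```python
def cluster_by_persistence(density, neighbors, tau):
    """Label each vertex with the index of the density peak its basin ends up in.

    Hypothetical helper for illustration: `density[i]` is a density estimate at
    vertex i, `neighbors[i]` lists the vertices adjacent to i, and `tau` is the
    prominence threshold below which a peak is treated as trivial.
    """
    # Visit vertices from highest to lowest density.
    order = sorted(range(len(density)), key=lambda i: -density[i])
    parent = {}  # union-find forest; every root is the highest peak of its basin

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    for v in order:
        # Basins of the neighbors that have already been visited.
        roots = {find(u) for u in neighbors[v] if u in parent}
        if not roots:
            parent[v] = v  # no higher neighbor: v is a local peak
            continue
        best = max(roots, key=lambda r: density[r])
        parent[v] = best  # v joins the basin with the highest peak
        for r in roots:
            # v is where basin r meets basin best; merge r if its peak is trivial.
            if r != best and density[r] - density[v] < tau:
                parent[r] = best
    return [find(i) for i in range(len(density))]


# Toy example (assumed data): a chain of 7 vertices with peaks at indices 1, 3, 5.
toy_density = [1.0, 5.0, 2.0, 4.5, 1.0, 9.0, 3.0]
toy_neighbors = [[1], [0, 2], [1, 3], [2, 4], [3, 5], [4, 6], [5]]
labels_fine = cluster_by_persistence(toy_density, toy_neighbors, tau=1.0)
labels_coarse = cluster_by_persistence(toy_density, toy_neighbors, tau=3.0)
```

With the small threshold all three peaks survive as separate clusters; with the larger one the peak at index 3, which rises only 2.5 above the saddle at index 2, is absorbed into the basin of index 1.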

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-ni17a,
  title = {Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data},
  author = {Xiuyan Ni and Novi Quadrianto and Yusu Wang and Chao Chen},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {2622--2631},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v70/ni17a/ni17a.pdf},
  url = {https://proceedings.mlr.press/v70/ni17a.html},
  abstract = {Clustering data with both continuous and discrete attributes is a challenging task. Existing methods lack a principled probabilistic formulation. In this paper, we propose a clustering method based on a tree-structured graphical model to describe the generation process of mixed-type data. Our tree-structured model factorizes into a product of pairwise interactions, and thus localizes the interaction between feature variables of different types. To provide a robust clustering method based on the tree model, we adopt a topographical view and compute peaks of the density function and their attractive basins for clustering. Furthermore, we leverage the theory of topological data analysis to adaptively merge trivial peaks into large ones in order to achieve meaningful clusterings. Our method outperforms state-of-the-art methods on mixed-type data.}
}
Endnote
%0 Conference Paper
%T Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data
%A Xiuyan Ni
%A Novi Quadrianto
%A Yusu Wang
%A Chao Chen
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-ni17a
%I PMLR
%P 2622--2631
%U https://proceedings.mlr.press/v70/ni17a.html
%V 70
%X Clustering data with both continuous and discrete attributes is a challenging task. Existing methods lack a principled probabilistic formulation. In this paper, we propose a clustering method based on a tree-structured graphical model to describe the generation process of mixed-type data. Our tree-structured model factorizes into a product of pairwise interactions, and thus localizes the interaction between feature variables of different types. To provide a robust clustering method based on the tree model, we adopt a topographical view and compute peaks of the density function and their attractive basins for clustering. Furthermore, we leverage the theory of topological data analysis to adaptively merge trivial peaks into large ones in order to achieve meaningful clusterings. Our method outperforms state-of-the-art methods on mixed-type data.
APA
Ni, X., Quadrianto, N., Wang, Y. & Chen, C. (2017). Composing Tree Graphical Models with Persistent Homology Features for Clustering Mixed-Type Data. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:2622-2631. Available from https://proceedings.mlr.press/v70/ni17a.html.

Related Material