Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering

Bo Yang, Xiao Fu, Nicholas D. Sidiropoulos, Mingyi Hong
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3861-3870, 2017.

Abstract

Most learning approaches treat dimensionality reduction (DR) and clustering separately (i.e., sequentially), but recent research has shown that optimizing the two tasks jointly can substantially improve the performance of both. The premise behind the latter genre is that the data samples are obtained via linear transformation of latent representations that are easy to cluster; but in practice, the transformation from the latent space to the data can be more complicated. In this work, we assume that this transformation is an unknown and possibly nonlinear function. To recover the "clustering-friendly" latent representations and to better cluster the data, we propose a joint DR and K-means clustering approach in which DR is accomplished via learning a deep neural network (DNN). The motivation is to keep the advantages of jointly optimizing the two tasks, while exploiting the deep neural network's ability to approximate any nonlinear function. This way, the proposed approach can work well for a broad class of generative models. Towards this end, we carefully design the DNN structure and the associated joint optimization criterion, and propose an effective and scalable algorithm to handle the formulated optimization problem. Experiments using different real datasets are employed to showcase the effectiveness of the proposed approach.
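To make the joint objective concrete, the sketch below pairs an autoencoder's reconstruction loss with a K-means penalty on the latent codes and alternates between a gradient update of the network and discrete updates of the cluster assignments and centroids. This is a minimal PyTorch illustration of the idea described in the abstract, not the authors' released implementation; the layer sizes, the weight lam, the use of Adam, and the helper names (AutoEncoder, joint_loss, train_step) are assumptions made for this example.

# Hedged sketch of the joint DR + K-means idea: the encoder plays the role of the
# nonlinear DR map, and a K-means-style penalty pulls latent codes toward their
# assigned centroids. All sizes and hyperparameters are illustrative.
import torch
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, d_in, d_latent):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(d_in, 256), nn.ReLU(),
                                     nn.Linear(256, d_latent))
        self.decoder = nn.Sequential(nn.Linear(d_latent, 256), nn.ReLU(),
                                     nn.Linear(256, d_in))

    def forward(self, x):
        z = self.encoder(x)
        return z, self.decoder(z)

def joint_loss(x, z, x_hat, centroids, assignments, lam=0.5):
    # Reconstruction error plus a K-means penalty on the latent codes.
    recon = ((x - x_hat) ** 2).sum(dim=1).mean()
    kmeans = ((z - centroids[assignments]) ** 2).sum(dim=1).mean()
    return recon + 0.5 * lam * kmeans

def train_step(model, opt, x, centroids, assignments, lam=0.5):
    # (1) gradient step on the network for the combined loss
    z, x_hat = model(x)
    loss = joint_loss(x, z, x_hat, centroids, assignments, lam)
    opt.zero_grad()
    loss.backward()
    opt.step()
    with torch.no_grad():
        # (2) re-assign each latent code to its nearest centroid
        z, _ = model(x)
        dists = torch.cdist(z, centroids)        # (n, k) pairwise distances
        assignments = dists.argmin(dim=1)
        # (3) refresh each centroid from its current members
        for k in range(centroids.shape[0]):
            members = z[assignments == k]
            if len(members) > 0:
                centroids[k] = members.mean(dim=0)
    return loss.item(), centroids, assignments

# Hypothetical usage: model = AutoEncoder(784, 10); opt = torch.optim.Adam(model.parameters());
# then repeatedly call train_step on mini-batches, carrying centroids and assignments forward.

One practical reason for keeping the reconstruction term alongside the K-means penalty, as the joint formulation suggests, is that it discourages the degenerate solution in which the encoder collapses all latent codes onto the centroids regardless of the input.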

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-yang17b,
  title     = {Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering},
  author    = {Bo Yang and Xiao Fu and Nicholas D. Sidiropoulos and Mingyi Hong},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {3861--3870},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/yang17b/yang17b.pdf},
  url       = {https://proceedings.mlr.press/v70/yang17b.html}
}
Endnote
%0 Conference Paper
%T Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering
%A Bo Yang
%A Xiao Fu
%A Nicholas D. Sidiropoulos
%A Mingyi Hong
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-yang17b
%I PMLR
%P 3861--3870
%U https://proceedings.mlr.press/v70/yang17b.html
%V 70
APA
Yang, B., Fu, X., Sidiropoulos, N.D. & Hong, M. (2017). Towards K-means-friendly Spaces: Simultaneous Deep Learning and Clustering. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3861-3870. Available from https://proceedings.mlr.press/v70/yang17b.html.