Nonlinear Dimensionality Reduction of Data by Deep Distributed Random Samplings

Xiao-Lei Zhang
Proceedings of the Sixth Asian Conference on Machine Learning, PMLR 39:221-233, 2015.

Abstract

Dimensionality reduction is a fundamental problem of machine learning and has been intensively studied; classification and clustering are two special cases of dimensionality reduction that reduce high-dimensional data to discrete points. Here we describe a simple multilayer network for dimensionality reduction in which each layer is a group of mutually independent k-centers clusterings. We find that the network can be trained successfully layer-by-layer, simply by assigning the centers of each clustering to data points randomly sampled from the input. Our results show that this simple method outperformed 7 well-known dimensionality reduction methods on both very small-scale biomedical data and large-scale image and document data, with less training time than multilayer neural networks on large-scale data.
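The layer construction described in the abstract can be sketched as follows. This is a minimal illustrative sketch of the idea (random data points as cluster centers, one-hot nearest-center outputs, layers stacked by feeding one layer's output to the next), not the paper's exact DDRS algorithm; the function name and parameters are hypothetical.

```python
import numpy as np

def random_kcenters_layer(X, k=8, n_clusterings=4, rng=None):
    """One layer: a group of mutually independent k-centers clusterings.

    Each clustering's k centers are data points sampled uniformly at random
    from the layer input X; the layer output concatenates the one-hot
    nearest-center indicators of all clusterings.
    """
    rng = np.random.default_rng(rng)
    n = X.shape[0]
    outputs = []
    for _ in range(n_clusterings):
        # "Training" is just random sampling: pick k input points as centers.
        centers = X[rng.choice(n, size=k, replace=False)]
        # Distance from every point to every center, then one-hot encode
        # the index of the nearest center.
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        onehot = np.zeros((n, k))
        onehot[np.arange(n), d.argmin(axis=1)] = 1.0
        outputs.append(onehot)
    return np.concatenate(outputs, axis=1)

# Layer-by-layer stacking: each layer consumes the previous layer's output.
X = np.random.default_rng(0).normal(size=(100, 20))
h1 = random_kcenters_layer(X, rng=1)   # shape (100, 32)
h2 = random_kcenters_layer(h1, rng=2)  # shape (100, 32)
```

Because no optimization is involved, "training" a layer costs only the random sampling and one nearest-neighbor assignment pass, which is consistent with the abstract's claim of lower training time than multilayer neural networks.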

Cite this Paper


BibTeX
@InProceedings{pmlr-v39-zhang14,
  title     = {Nonlinear Dimensionality Reduction of Data by Deep Distributed Random Samplings},
  author    = {Zhang, Xiao-Lei},
  booktitle = {Proceedings of the Sixth Asian Conference on Machine Learning},
  pages     = {221--233},
  year      = {2015},
  editor    = {Phung, Dinh and Li, Hang},
  volume    = {39},
  series    = {Proceedings of Machine Learning Research},
  address   = {Nha Trang City, Vietnam},
  month     = {26--28 Nov},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v39/zhang14.pdf},
  url       = {https://proceedings.mlr.press/v39/zhang14.html},
  abstract  = {Dimensionality reduction is a fundamental problem of machine learning, and has been intensively studied, where classification and clustering are two special cases of dimensionality reduction that reduce high-dimensional data to discrete points. Here we describe a simple multilayer network for dimensionality reduction that each layer of the network is a group of mutually independent k-centers clusterings. We find that the network can be trained successfully layer-by-layer by simply assigning the centers of each clustering by randomly sampled data points from the input. Our results show that the described simple method outperformed 7 well-known dimensionality reduction methods on both very small-scale biomedical data and large-scale image and document data, with less training time than multilayer neural networks on large-scale data.}
}
Endnote
%0 Conference Paper
%T Nonlinear Dimensionality Reduction of Data by Deep Distributed Random Samplings
%A Xiao-Lei Zhang
%B Proceedings of the Sixth Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Dinh Phung
%E Hang Li
%F pmlr-v39-zhang14
%I PMLR
%P 221--233
%U https://proceedings.mlr.press/v39/zhang14.html
%V 39
%X Dimensionality reduction is a fundamental problem of machine learning, and has been intensively studied, where classification and clustering are two special cases of dimensionality reduction that reduce high-dimensional data to discrete points. Here we describe a simple multilayer network for dimensionality reduction that each layer of the network is a group of mutually independent k-centers clusterings. We find that the network can be trained successfully layer-by-layer by simply assigning the centers of each clustering by randomly sampled data points from the input. Our results show that the described simple method outperformed 7 well-known dimensionality reduction methods on both very small-scale biomedical data and large-scale image and document data, with less training time than multilayer neural networks on large-scale data.
RIS
TY  - CPAPER
TI  - Nonlinear Dimensionality Reduction of Data by Deep Distributed Random Samplings
AU  - Xiao-Lei Zhang
BT  - Proceedings of the Sixth Asian Conference on Machine Learning
DA  - 2015/02/16
ED  - Dinh Phung
ED  - Hang Li
ID  - pmlr-v39-zhang14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 39
SP  - 221
EP  - 233
L1  - http://proceedings.mlr.press/v39/zhang14.pdf
UR  - https://proceedings.mlr.press/v39/zhang14.html
AB  - Dimensionality reduction is a fundamental problem of machine learning, and has been intensively studied, where classification and clustering are two special cases of dimensionality reduction that reduce high-dimensional data to discrete points. Here we describe a simple multilayer network for dimensionality reduction that each layer of the network is a group of mutually independent k-centers clusterings. We find that the network can be trained successfully layer-by-layer by simply assigning the centers of each clustering by randomly sampled data points from the input. Our results show that the described simple method outperformed 7 well-known dimensionality reduction methods on both very small-scale biomedical data and large-scale image and document data, with less training time than multilayer neural networks on large-scale data.
ER  -
APA
Zhang, X. (2015). Nonlinear Dimensionality Reduction of Data by Deep Distributed Random Samplings. Proceedings of the Sixth Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 39:221-233. Available from https://proceedings.mlr.press/v39/zhang14.html.