Nonlinear Dimensionality Reduction of Data by Deep Distributed Random Samplings


Xiao-Lei Zhang ;
Proceedings of the Sixth Asian Conference on Machine Learning, PMLR 39:221-233, 2015.


Dimensionality reduction is a fundamental problem of machine learning, and has been intensively studied, where classification and clustering are two special cases of dimensionality reduction that reduce high-dimensional data to discrete points. Here we describe a simple multilayer network for dimensionality reduction that each layer of the network is a group of mutually independent k-centers clusterings. We find that the network can be trained successfully layer-by-layer by simply assigning the centers of each clustering by randomly sampled data points from the input. Our results show that the described simple method outperformed 7 well-known dimensionality reduction methods on both very small-scale biomedical data and large-scale image and document data, with less training time than multilayer neural networks on large-scale data.

Related Material