Stochastic Unsupervised Learning on Unlabeled Data

Chuanren Liu, Jianjun Xie, Yong Ge, Hui Xiong
Proceedings of ICML Workshop on Unsupervised and Transfer Learning, PMLR 27:111-122, 2012.

Abstract

In this paper, we introduce a stochastic unsupervised learning method that was used in the 2011 Unsupervised and Transfer Learning (UTL) challenge. The method preprocesses data for use in subsequent classification problems. Specifically, it performs K-means clustering on principal components instead of raw data, which reduces the impact of noisy, irrelevant, or less relevant features and improves the robustness of the results. To alleviate overfitting, we also use a stochastic process to combine multiple clustering assignments for each data point. Promising results were observed on all the test data sets, and the proposed method placed second in overall performance in the challenge.
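The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the clustering assignments are combined or where the randomness enters, so the choices below are assumptions made only for illustration: the randomness comes from random K-means initializations, and the runs are combined by concatenating one-hot cluster indicators. The function names (`pca_project`, `kmeans_labels`, `stochastic_cluster_features`) and all parameter values are hypothetical and are not taken from the paper.

```python
import numpy as np


def pca_project(X, n_components):
    """Project the rows of X onto the top principal components (via SVD)."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:n_components].T


def kmeans_labels(Z, k, rng, n_iter=50):
    """Plain Lloyd's K-means; returns one cluster index per row of Z."""
    # Random initialization: pick k distinct data points as starting centers.
    centers = Z[rng.choice(len(Z), size=k, replace=False)]
    labels = np.zeros(len(Z), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every center.
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            members = Z[labels == j]
            if len(members) > 0:  # keep the old center if a cluster empties
                centers[j] = members.mean(axis=0)
    return labels


def stochastic_cluster_features(X, n_runs=10, n_components=2, k=3, seed=0):
    """Assumed combination rule: concatenate one-hot assignments of several runs."""
    rng = np.random.default_rng(seed)
    Z = pca_project(X, n_components)  # cluster on principal components, not raw data
    blocks = []
    for _ in range(n_runs):
        labels = kmeans_labels(Z, k, rng)
        blocks.append(np.eye(k)[labels])  # one-hot encode this run's assignment
    return np.hstack(blocks)


# Toy data: three well-separated groups of 20 points in 5 dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(loc=c, scale=0.3, size=(20, 5)) for c in (0.0, 3.0, 6.0)])
features = stochastic_cluster_features(X)
print(features.shape)  # (60, 30): 60 points, 10 runs x 3 clusters
```

The resulting feature matrix could then be passed to a downstream classifier. Other combination rules, such as averaging a co-association matrix across runs, would fit the abstract's description equally well.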

Cite this Paper


BibTeX
@InProceedings{pmlr-v27-liu12a,
  title = {Stochastic Unsupervised Learning on Unlabeled Data},
  author = {Liu, Chuanren and Xie, Jianjun and Ge, Yong and Xiong, Hui},
  booktitle = {Proceedings of ICML Workshop on Unsupervised and Transfer Learning},
  pages = {111--122},
  year = {2012},
  editor = {Guyon, Isabelle and Dror, Gideon and Lemaire, Vincent and Taylor, Graham and Silver, Daniel},
  volume = {27},
  series = {Proceedings of Machine Learning Research},
  address = {Bellevue, Washington, USA},
  month = {02 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v27/liu12a/liu12a.pdf},
  url = {http://proceedings.mlr.press/v27/liu12a.html},
  abstract = {In this paper, we introduce a stochastic unsupervised learning method that was used in the 2011 Unsupervised and Transfer Learning (UTL) challenge. This method is developed to preprocess the data that will be used in the subsequent classification problems. Specifically, it performs K-means clustering on principal components instead of raw data to remove the impact of noisy/irrelevant/less-relevant features and improve the robustness of the results. To alleviate the overfitting problem, we also utilize a stochastic process to combine multiple clustering assignments on each data point. Finally, promising results were observed on all the test data sets. Indeed, this proposed method won us the second place in the overall performance of the challenge.}
}
Endnote
%0 Conference Paper
%T Stochastic Unsupervised Learning on Unlabeled Data
%A Chuanren Liu
%A Jianjun Xie
%A Yong Ge
%A Hui Xiong
%B Proceedings of ICML Workshop on Unsupervised and Transfer Learning
%C Proceedings of Machine Learning Research
%D 2012
%E Isabelle Guyon
%E Gideon Dror
%E Vincent Lemaire
%E Graham Taylor
%E Daniel Silver
%F pmlr-v27-liu12a
%I PMLR
%P 111--122
%U http://proceedings.mlr.press/v27/liu12a.html
%V 27
%X In this paper, we introduce a stochastic unsupervised learning method that was used in the 2011 Unsupervised and Transfer Learning (UTL) challenge. This method is developed to preprocess the data that will be used in the subsequent classification problems. Specifically, it performs K-means clustering on principal components instead of raw data to remove the impact of noisy/irrelevant/less-relevant features and improve the robustness of the results. To alleviate the overfitting problem, we also utilize a stochastic process to combine multiple clustering assignments on each data point. Finally, promising results were observed on all the test data sets. Indeed, this proposed method won us the second place in the overall performance of the challenge.
RIS
TY - CPAPER
TI - Stochastic Unsupervised Learning on Unlabeled Data
AU - Chuanren Liu
AU - Jianjun Xie
AU - Yong Ge
AU - Hui Xiong
BT - Proceedings of ICML Workshop on Unsupervised and Transfer Learning
DA - 2012/06/27
ED - Isabelle Guyon
ED - Gideon Dror
ED - Vincent Lemaire
ED - Graham Taylor
ED - Daniel Silver
ID - pmlr-v27-liu12a
PB - PMLR
DP - Proceedings of Machine Learning Research
VL - 27
SP - 111
EP - 122
L1 - http://proceedings.mlr.press/v27/liu12a/liu12a.pdf
UR - http://proceedings.mlr.press/v27/liu12a.html
AB - In this paper, we introduce a stochastic unsupervised learning method that was used in the 2011 Unsupervised and Transfer Learning (UTL) challenge. This method is developed to preprocess the data that will be used in the subsequent classification problems. Specifically, it performs K-means clustering on principal components instead of raw data to remove the impact of noisy/irrelevant/less-relevant features and improve the robustness of the results. To alleviate the overfitting problem, we also utilize a stochastic process to combine multiple clustering assignments on each data point. Finally, promising results were observed on all the test data sets. Indeed, this proposed method won us the second place in the overall performance of the challenge.
ER -
APA
Liu, C., Xie, J., Ge, Y. & Xiong, H. (2012). Stochastic Unsupervised Learning on Unlabeled Data. Proceedings of ICML Workshop on Unsupervised and Transfer Learning, in Proceedings of Machine Learning Research 27:111-122. Available from http://proceedings.mlr.press/v27/liu12a.html.
