Unsupervised Feature Selection by Preserving Stochastic Neighbors

Xiaokai Wei; Philip S. Yu

Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:995-1003, 2016.

Abstract

Feature selection is an important technique for alleviating the curse of dimensionality. Unsupervised feature selection is more challenging than its supervised counterpart due to the lack of labels. In this paper, we present an effective method, Stochastic Neighbor-preserving Feature Selection (SNFS), for selecting discriminative features in the unsupervised setting. We employ the concept of stochastic neighbors and select the features that best preserve such stochastic neighbors by minimizing the Kullback-Leibler (KL) divergence between neighborhood distributions. The proposed approach measures feature utility jointly in a non-linear way, and discriminative features can be selected due to its "push-pull" property. We develop an efficient algorithm for optimizing the objective function based on the projected quasi-Newton method. Moreover, few existing methods provide ways of determining the optimal number of selected features, which hampers their utility in practice. Our approach is equipped with a guideline for choosing the number of features, which provides nearly optimal performance in our experiments. Experimental results show that the proposed method significantly outperforms state-of-the-art methods on several real-world datasets.
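
The idea of preserving stochastic neighbors can be illustrated with a small sketch. The code below is not the authors' implementation: the Gaussian neighborhood kernel, the per-feature weight vector `w`, the bandwidth `sigma`, and the function names `neighbor_probs` and `kl_neighbor_loss` are all assumptions made for illustration, and the projected quasi-Newton optimizer mentioned in the abstract is omitted. It only shows how a KL divergence between the neighborhood distribution on all features and the distribution on a weighted subset might be evaluated.

```python
import numpy as np


def neighbor_probs(X, w, sigma=1.0):
    """Row-stochastic matrix P; P[i, j] is the probability that i picks j as neighbor.

    Assumption: a Gaussian kernel on feature-weighted squared distances,
    d2[i, j] = sum_d w[d] * (X[i, d] - X[j, d]) ** 2.
    """
    Xw = X * np.sqrt(w)  # scaling by sqrt(w) weights the squared differences by w
    sq = np.sum(Xw ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Xw @ Xw.T), 0.0)
    logits = -d2 / (2.0 * sigma ** 2)
    np.fill_diagonal(logits, -np.inf)  # a point is never its own neighbor
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    P = np.exp(logits)
    return P / P.sum(axis=1, keepdims=True)


def kl_neighbor_loss(X, w, sigma=1.0, eps=1e-12):
    """Sum over points of KL(P_i || Q_i).

    P uses all features with weight 1; Q uses the candidate weights w.
    A lower value means the weighted features preserve the neighborhoods better.
    """
    P = neighbor_probs(X, np.ones(X.shape[1]), sigma)
    Q = neighbor_probs(X, np.asarray(w, dtype=float), sigma)
    off = ~np.eye(X.shape[0], dtype=bool)  # skip the zero diagonal
    return float(np.sum(P[off] * (np.log(P[off] + eps) - np.log(Q[off] + eps))))
```

Under these assumptions, candidate feature subsets can be compared by passing binary weight vectors, for example `kl_neighbor_loss(X, [1.0, 0.0])` against `kl_neighbor_loss(X, [0.0, 1.0])`; the subset with the lower loss keeps the original neighborhoods more faithfully.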

Cite this Paper

BibTeX


@InProceedings{pmlr-v51-wei16,
  title = 	 {Unsupervised Feature Selection by Preserving Stochastic Neighbors},
  author = 	 {Wei, Xiaokai and Yu, Philip S.},
  booktitle = 	 {Proceedings of the 19th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {995--1003},
  year = 	 {2016},
  editor = 	 {Gretton, Arthur and Robert, Christian C.},
  volume = 	 {51},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Cadiz, Spain},
  month = 	 {09--11 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v51/wei16.pdf},
  url = 	 {https://proceedings.mlr.press/v51/wei16.html},
  abstract = 	 {Feature selection is an important technique for alleviating the curse of dimensionality. Unsupervised feature selection is more challenging than its supervised counterpart due to the lack of labels. In this paper, we present an effective method, Stochastic Neighbor-preserving Feature Selection (SNFS), for selecting discriminative features in unsupervised setting. We employ the concept of stochastic neighbors and select the features that can best preserve such stochastic neighbors by minimizing the Kullback-Leibler (KL) Divergence between neighborhood distributions. The proposed approach measures feature utility jointly in a non-linear way and discriminative features can be selected due to its ’push-pull’ property. We develop an efficient algorithm for optimizing the objective function based on projected quasi-Newton method. Moreover, few existing methods provide ways for determining the optimal number of selected features and this hampers their utility in practice. Our approach is equipped with a guideline for choosing the number of features, which provides nearly optimal performance in our experiments. Experimental results show that the proposed method outperforms state-of-the-art methods significantly on several real-world datasets.}
}

Endnote

%0 Conference Paper
%T Unsupervised Feature Selection by Preserving Stochastic Neighbors
%A Xiaokai Wei
%A Philip S. Yu
%B Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2016
%E Arthur Gretton
%E Christian C. Robert
%F pmlr-v51-wei16
%I PMLR
%P 995--1003
%U https://proceedings.mlr.press/v51/wei16.html
%V 51
%X Feature selection is an important technique for alleviating the curse of dimensionality. Unsupervised feature selection is more challenging than its supervised counterpart due to the lack of labels. In this paper, we present an effective method, Stochastic Neighbor-preserving Feature Selection (SNFS), for selecting discriminative features in unsupervised setting. We employ the concept of stochastic neighbors and select the features that can best preserve such stochastic neighbors by minimizing the Kullback-Leibler (KL) Divergence between neighborhood distributions. The proposed approach measures feature utility jointly in a non-linear way and discriminative features can be selected due to its ’push-pull’ property. We develop an efficient algorithm for optimizing the objective function based on projected quasi-Newton method. Moreover, few existing methods provide ways for determining the optimal number of selected features and this hampers their utility in practice. Our approach is equipped with a guideline for choosing the number of features, which provides nearly optimal performance in our experiments. Experimental results show that the proposed method outperforms state-of-the-art methods significantly on several real-world datasets.

RIS


TY  - CPAPER
TI  - Unsupervised Feature Selection by Preserving Stochastic Neighbors
AU  - Xiaokai Wei
AU  - Philip S. Yu
BT  - Proceedings of the 19th International Conference on Artificial Intelligence and Statistics
DA  - 2016/05/02
ED  - Arthur Gretton
ED  - Christian C. Robert
ID  - pmlr-v51-wei16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 51
SP  - 995
EP  - 1003
L1  - http://proceedings.mlr.press/v51/wei16.pdf
UR  - https://proceedings.mlr.press/v51/wei16.html
AB  - Feature selection is an important technique for alleviating the curse of dimensionality. Unsupervised feature selection is more challenging than its supervised counterpart due to the lack of labels. In this paper, we present an effective method, Stochastic Neighbor-preserving Feature Selection (SNFS), for selecting discriminative features in unsupervised setting. We employ the concept of stochastic neighbors and select the features that can best preserve such stochastic neighbors by minimizing the Kullback-Leibler (KL) Divergence between neighborhood distributions. The proposed approach measures feature utility jointly in a non-linear way and discriminative features can be selected due to its ’push-pull’ property. We develop an efficient algorithm for optimizing the objective function based on projected quasi-Newton method. Moreover, few existing methods provide ways for determining the optimal number of selected features and this hampers their utility in practice. Our approach is equipped with a guideline for choosing the number of features, which provides nearly optimal performance in our experiments. Experimental results show that the proposed method outperforms state-of-the-art methods significantly on several real-world datasets.
ER  -

APA


Wei, X. & Yu, P.S. (2016). Unsupervised Feature Selection by Preserving Stochastic Neighbors. Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 51:995-1003. Available from https://proceedings.mlr.press/v51/wei16.html.
