Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis

Zheng Zhao, Huan Liu
Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, PMLR 4:36-47, 2008.

Abstract

Feature selection is an effective approach to reducing dimensionality by selecting relevant original features. In this work, we studied a novel problem of multi-source feature selection for unlabeled data: given multiple heterogeneous data sources (or data sets), select features from one source of interest by integrating information from various data sources. In essence, we investigate how we can employ the information contained in multiple data sources to effectively derive intrinsic relationships that can help select more meaningful (or domain relevant) features. We studied how to adjust the covariance matrix of a data set using the geometric structure obtained from multiple data sources, and how to select features of the target source using geometry-dependent covariance. We designed and conducted experiments to systematically compare the proposed approach with representative methods in our attempt to solve the novel problem of multi-source feature selection. The empirical study demonstrated the efficacy and potential of multi-source feature selection.

Cite this Paper


BibTeX
@InProceedings{pmlr-v4-zhao08a,
  title = {Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis},
  author = {Zhao, Zheng and Liu, Huan},
  booktitle = {Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008},
  pages = {36--47},
  year = {2008},
  editor = {Saeys, Yvan and Liu, Huan and Inza, Iñaki and Wehenkel, Louis and Van de Peer, Yves},
  volume = {4},
  series = {Proceedings of Machine Learning Research},
  address = {Antwerp, Belgium},
  month = {15 Sep},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v4/zhao08a/zhao08a.pdf},
  url = {https://proceedings.mlr.press/v4/zhao08a.html},
  abstract = {Feature selection is an effective approach to reducing dimensionality by selecting relevant original features. In this work, we studied a novel problem of multi-source feature selection for unlabeled data: given multiple heterogeneous data sources (or data sets), select features from one source of interest by integrating information from various data sources. In essence, we investigate how we can employ the information contained in multiple data sources to effectively derive intrinsic relationships that can help select more meaningful (or domain relevant) features. We studied how to adjust the covariance matrix of a data set using the geometric structure obtained from multiple data sources, and how to select features of the target source using geometry-dependent covariance. We designed and conducted experiments to systematically compare the proposed approach with representative methods in our attempt to solve the novel problem of multi-source feature selection. The empirical study demonstrated the efficacy and potential of multi-source feature selection.}
}
Endnote
%0 Conference Paper
%T Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis
%A Zheng Zhao
%A Huan Liu
%B Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
%C Proceedings of Machine Learning Research
%D 2008
%E Yvan Saeys
%E Huan Liu
%E Iñaki Inza
%E Louis Wehenkel
%E Yves Van de Peer
%F pmlr-v4-zhao08a
%I PMLR
%P 36--47
%U https://proceedings.mlr.press/v4/zhao08a.html
%V 4
%X Feature selection is an effective approach to reducing dimensionality by selecting relevant original features. In this work, we studied a novel problem of multi-source feature selection for unlabeled data: given multiple heterogeneous data sources (or data sets), select features from one source of interest by integrating information from various data sources. In essence, we investigate how we can employ the information contained in multiple data sources to effectively derive intrinsic relationships that can help select more meaningful (or domain relevant) features. We studied how to adjust the covariance matrix of a data set using the geometric structure obtained from multiple data sources, and how to select features of the target source using geometry-dependent covariance. We designed and conducted experiments to systematically compare the proposed approach with representative methods in our attempt to solve the novel problem of multi-source feature selection. The empirical study demonstrated the efficacy and potential of multi-source feature selection.
RIS
TY  - CPAPER
TI  - Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis
AU  - Zheng Zhao
AU  - Huan Liu
BT  - Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
DA  - 2008/09/11
ED  - Yvan Saeys
ED  - Huan Liu
ED  - Iñaki Inza
ED  - Louis Wehenkel
ED  - Yves Van de Peer
ID  - pmlr-v4-zhao08a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 4
SP  - 36
EP  - 47
L1  - http://proceedings.mlr.press/v4/zhao08a/zhao08a.pdf
UR  - https://proceedings.mlr.press/v4/zhao08a.html
AB  - Feature selection is an effective approach to reducing dimensionality by selecting relevant original features. In this work, we studied a novel problem of multi-source feature selection for unlabeled data: given multiple heterogeneous data sources (or data sets), select features from one source of interest by integrating information from various data sources. In essence, we investigate how we can employ the information contained in multiple data sources to effectively derive intrinsic relationships that can help select more meaningful (or domain relevant) features. We studied how to adjust the covariance matrix of a data set using the geometric structure obtained from multiple data sources, and how to select features of the target source using geometry-dependent covariance. We designed and conducted experiments to systematically compare the proposed approach with representative methods in our attempt to solve the novel problem of multi-source feature selection. The empirical study demonstrated the efficacy and potential of multi-source feature selection.
ER  -
APA
Zhao, Z. & Liu, H. (2008). Multi-Source Feature Selection via Geometry-Dependent Covariance Analysis. Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, in Proceedings of Machine Learning Research 4:36-47. Available from https://proceedings.mlr.press/v4/zhao08a.html.
