Best Agglomerative Ranked Subset for Feature Selection

Roberto Ruiz; José C. Riquelme; Jesús S. Aguilar-Ruiz

Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, PMLR 4:148-162, 2008.

Abstract

The enormous growth in database size makes finding an optimal subset of features extremely difficult. This paper proposes a new feature selection method that allows any subset evaluator, including the wrapper evaluation method, to be used to find a group of features that distinguishes between the possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the ideas of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and improves accuracy in comparison to the complete feature set, as opposed to other feature selection algorithms.

Cite this Paper

BibTeX


@InProceedings{pmlr-v4-ruiz08a,
  title = 	 {Best Agglomerative Ranked Subset for Feature Selection},
  author = 	 {Ruiz, Roberto and Riquelme, José C. and Aguilar-Ruiz, Jesús S.},
  booktitle = 	 {Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008},
  pages = 	 {148--162},
  year = 	 {2008},
  editor = 	 {Saeys, Yvan and Liu, Huan and Inza, Iñaki and Wehenkel, Louis and Van de Peer, Yves},
  volume = 	 {4},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Antwerp, Belgium},
  month = 	 {15 Sep},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v4/ruiz08a/ruiz08a.pdf},
  url = 	 {https://proceedings.mlr.press/v4/ruiz08a.html},
  abstract = 	 {The enormous increase of the size in databases makes finding an optimal subset of features extremely difficult. In this paper, a new feature selection method is proposed that will allow any subset evaluator -including the wrapper evaluation method- to be used to find a group of features that will allow a distinction to be made between the different possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the idea of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and leads to improvements in the accuracy, in comparison to a complete set and as opposed to other feature selection algorithms.}
}

Endnote

%0 Conference Paper
%T Best Agglomerative Ranked Subset for Feature Selection
%A Roberto Ruiz
%A José C. Riquelme
%A Jesús S. Aguilar-Ruiz
%B Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
%C Proceedings of Machine Learning Research
%D 2008
%E Yvan Saeys
%E Huan Liu
%E Iñaki Inza
%E Louis Wehenkel
%E Yves Van de Peer
%F pmlr-v4-ruiz08a
%I PMLR
%P 148--162
%U https://proceedings.mlr.press/v4/ruiz08a.html
%V 4
%X The enormous increase of the size in databases makes finding an optimal subset of features extremely difficult. In this paper, a new feature selection method is proposed that will allow any subset evaluator -including the wrapper evaluation method- to be used to find a group of features that will allow a distinction to be made between the different possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the idea of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and leads to improvements in the accuracy, in comparison to a complete set and as opposed to other feature selection algorithms.

RIS


TY  - CPAPER
TI  - Best Agglomerative Ranked Subset for Feature Selection
AU  - Roberto Ruiz
AU  - José C. Riquelme
AU  - Jesús S. Aguilar-Ruiz
BT  - Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
DA  - 2008/09/11
ED  - Yvan Saeys
ED  - Huan Liu
ED  - Iñaki Inza
ED  - Louis Wehenkel
ED  - Yves Van de Peer
ID  - pmlr-v4-ruiz08a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 4
SP  - 148
EP  - 162
L1  - http://proceedings.mlr.press/v4/ruiz08a/ruiz08a.pdf
UR  - https://proceedings.mlr.press/v4/ruiz08a.html
AB  - The enormous increase of the size in databases makes finding an optimal subset of features extremely difficult. In this paper, a new feature selection method is proposed that will allow any subset evaluator -including the wrapper evaluation method- to be used to find a group of features that will allow a distinction to be made between the different possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the idea of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and leads to improvements in the accuracy, in comparison to a complete set and as opposed to other feature selection algorithms.
ER  -

APA


Ruiz, R., Riquelme, J.C. & Aguilar-Ruiz, J.S. (2008). Best Agglomerative Ranked Subset for Feature Selection. Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, in Proceedings of Machine Learning Research 4:148-162. Available from https://proceedings.mlr.press/v4/ruiz08a.html.
