Best Agglomerative Ranked Subset for Feature Selection

Roberto Ruiz, José C. Riquelme, Jesús S. Aguilar-Ruiz
Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, PMLR 4:148-162, 2008.

Abstract

The enormous growth in database size makes finding an optimal subset of features extremely difficult. This paper proposes a new feature selection method that allows any subset evaluator, including the wrapper evaluation method, to be used to find a group of features that distinguishes between the possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the ideas of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and leads to improvements in accuracy, in comparison to the complete feature set and as opposed to other feature selection algorithms.
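The abstract does not spell out the agglomeration steps of BARS, so the following is only a minimal sketch of the general idea it describes: rank the features, then keep a ranked feature only if it adds information to the subset selected so far, using a pluggable subset evaluator. It is a simplified ranked forward selection, not the authors' algorithm. The function names (`ranked_forward_selection`, `loo_1nn_accuracy`), the `min_gain` parameter, the leave-one-out 1-NN wrapper, and the toy data are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence

# Hypothetical evaluator type: maps a list of feature indices to a score.
Evaluator = Callable[[List[int]], float]


def ranked_forward_selection(n_features: int, evaluate: Evaluator,
                             min_gain: float = 0.0) -> List[int]:
    """Rank features individually, then accept a ranked feature only if
    adding it improves the score of the subset selected so far."""
    # Step 1: rank features by the score each one obtains on its own.
    ranking = sorted(range(n_features), key=lambda f: evaluate([f]), reverse=True)
    selected: List[int] = []
    best = float("-inf")
    # Step 2: walk down the ranking; a feature that does not raise the
    # score is treated as redundant (or irrelevant) and skipped.
    for f in ranking:
        score = evaluate(selected + [f])
        if score > best + min_gain:
            selected.append(f)
            best = score
    return selected


def loo_1nn_accuracy(X: Sequence[Sequence[float]], y: Sequence[int]) -> Evaluator:
    """Build a wrapper-style evaluator: leave-one-out 1-NN accuracy
    computed on the given feature subset (assumed stand-in classifier)."""
    def evaluate(features: List[int]) -> float:
        hits = 0
        for i, xi in enumerate(X):
            # Nearest other row, using squared Euclidean distance on `features`.
            nearest = min(
                (j for j in range(len(X)) if j != i),
                key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
            )
            if y[nearest] == y[i]:
                hits += 1
        return hits / len(X)
    return evaluate


# Toy data: feature 0 separates the classes, feature 1 is an exact copy of
# feature 0 (redundant), and feature 2 is unrelated to the class.
X = [
    [0.0, 0.0, 5.0],
    [0.1, 0.1, 1.0],
    [0.2, 0.2, 3.0],
    [1.0, 1.0, 4.0],
    [1.1, 1.1, 2.0],
    [1.2, 1.2, 6.0],
]
y = [0, 0, 0, 1, 1, 1]

selected = ranked_forward_selection(3, loo_1nn_accuracy(X, y))
print(selected)
```

On this toy data the copy of feature 0 is rejected because it adds no information, which is the redundancy notion the abstract refers to. Any other subset evaluator (a filter measure or a different wrapped classifier) could be passed in place of the 1-NN wrapper.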

Cite this Paper


BibTeX
@InProceedings{pmlr-v4-ruiz08a,
  title = {Best Agglomerative Ranked Subset for Feature Selection},
  author = {Ruiz, Roberto and Riquelme, José C. and Aguilar-Ruiz, Jesús S.},
  booktitle = {Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008},
  pages = {148--162},
  year = {2008},
  editor = {Saeys, Yvan and Liu, Huan and Inza, Iñaki and Wehenkel, Louis and Pee, Yves Van de},
  volume = {4},
  series = {Proceedings of Machine Learning Research},
  address = {Antwerp, Belgium},
  month = {15 Sep},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v4/ruiz08a/ruiz08a.pdf},
  url = {https://proceedings.mlr.press/v4/ruiz08a.html},
  abstract = {The enormous increase of the size in databases makes finding an optimal subset of features extremely difficult. In this paper, a new feature selection method is proposed that will allow any subset evaluator -including the wrapper evaluation method- to be used to find a group of features that will allow a distinction to be made between the different possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the idea of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and leads to improvements in the accuracy, in comparison to a complete set and as opposed to other feature selection algorithms.}
}
Endnote
%0 Conference Paper
%T Best Agglomerative Ranked Subset for Feature Selection
%A Roberto Ruiz
%A José C. Riquelme
%A Jesús S. Aguilar-Ruiz
%B Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
%C Proceedings of Machine Learning Research
%D 2008
%E Yvan Saeys
%E Huan Liu
%E Iñaki Inza
%E Louis Wehenkel
%E Yves Van de Pee
%F pmlr-v4-ruiz08a
%I PMLR
%P 148--162
%U https://proceedings.mlr.press/v4/ruiz08a.html
%V 4
%X The enormous increase of the size in databases makes finding an optimal subset of features extremely difficult. In this paper, a new feature selection method is proposed that will allow any subset evaluator -including the wrapper evaluation method- to be used to find a group of features that will allow a distinction to be made between the different possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the idea of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and leads to improvements in the accuracy, in comparison to a complete set and as opposed to other feature selection algorithms.
RIS
TY  - CPAPER
TI  - Best Agglomerative Ranked Subset for Feature Selection
AU  - Roberto Ruiz
AU  - José C. Riquelme
AU  - Jesús S. Aguilar-Ruiz
BT  - Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008
DA  - 2008/09/11
ED  - Yvan Saeys
ED  - Huan Liu
ED  - Iñaki Inza
ED  - Louis Wehenkel
ED  - Yves Van de Pee
ID  - pmlr-v4-ruiz08a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 4
SP  - 148
EP  - 162
L1  - http://proceedings.mlr.press/v4/ruiz08a/ruiz08a.pdf
UR  - https://proceedings.mlr.press/v4/ruiz08a.html
AB  - The enormous increase of the size in databases makes finding an optimal subset of features extremely difficult. In this paper, a new feature selection method is proposed that will allow any subset evaluator -including the wrapper evaluation method- to be used to find a group of features that will allow a distinction to be made between the different possible classes. The method, BARS (Best Agglomerative Ranked Subset), is based on the idea of relevance and redundancy, in the sense that a ranked feature (or set) is more relevant if it adds information when it is included in the final subset of selected features. This heuristic method reduces dimensionality drastically and leads to improvements in the accuracy, in comparison to a complete set and as opposed to other feature selection algorithms.
ER  -
APA
Ruiz, R., Riquelme, J.C. & Aguilar-Ruiz, J.S. (2008). Best Agglomerative Ranked Subset for Feature Selection. Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, in Proceedings of Machine Learning Research 4:148-162. Available from https://proceedings.mlr.press/v4/ruiz08a.html.
