Unsupervised Variable Selection: when random rankings sound as irrelevancy
Proceedings of the Workshop on New Challenges for Feature Selection in Data Mining and Knowledge Discovery at ECML/PKDD 2008, PMLR 4:163-177, 2008.
Whereas variable selection has been extensively studied in the context of supervised learning, unsupervised variable selection has attracted researchers' attention only more recently, as the amount of available unlabeled data has exploded. Many unsupervised variable ranking criteria have been proposed, and their relevance is usually demonstrated using either external cluster validity indexes or the accuracy of a classifier, both of which are supervised criteria. In fact, the major issue of selecting a variable subset according to a ranking measure has been addressed by only a few authors in the unsupervised learning context. In this paper, we propose combining multiple rankings to move toward a stable consensus variable subset in a totally unsupervised fashion.