Conformal Stability Measure of Feature Selection Algorithms

Marcos López-De-Castro, Alberto García-Galindo, Rubén Armañanzas
Proceedings of the Thirteenth Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 230:105-119, 2024.

Abstract

Quantifying the stability of feature selection techniques has been an ongoing challenge over the last two decades. A large number of stability estimators have been proposed to overcome this problem, but performance guarantees based on suitable statistical frameworks are lacking. A recently developed framework proposed a new and robust estimator of the stability and a method to quantify the uncertainty of the estimates through approximate confidence intervals. Unfortunately, this statistical framework is based on asymptotic assumptions. In situations in which a low number of subsets of selected features are available for the quantification of the stability estimator, the coverage guarantees provided by this framework do not hold. In this work, we propose a method to estimate stability and achieve validity in a situation where only a few samples are available. We take advantage of the Conformal Prediction framework, constructing prediction intervals without any assumption about the underlying distribution of data. Extensive simulations show that our method successfully achieves conservative validity. Furthermore, as the number of available samples increases, efficiency is also achieved. Comparisons between prediction intervals and confidence intervals show an acceptable trade-off between coverage guarantees and the interval length for the former, while there is a clear miscoverage for the latter.

Cite this Paper


BibTeX
@InProceedings{pmlr-v230-lopez-de-castro24a,
  title = {Conformal Stability Measure of Feature Selection Algorithms},
  author = {L\'{o}pez-De-Castro, Marcos and Garc\'{i}a-Galindo, Alberto and Arma\~{n}anzas, Rub\'{e}n},
  booktitle = {Proceedings of the Thirteenth Symposium on Conformal and Probabilistic Prediction with Applications},
  pages = {105--119},
  year = {2024},
  editor = {Vantini, Simone and Fontana, Matteo and Solari, Aldo and Boström, Henrik and Carlsson, Lars},
  volume = {230},
  series = {Proceedings of Machine Learning Research},
  month = {09--11 Sep},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v230/main/assets/lopez-de-castro24a/lopez-de-castro24a.pdf},
  url = {https://proceedings.mlr.press/v230/lopez-de-castro24a.html},
  abstract = {Quantifying the stability of feature selection techniques has been an ongoing challenge over the last two decades. A large number of stability estimators have been proposed to overcome this problem, but performance guarantees based on suitable statistical frameworks are lacking. A recently developed framework proposed a new and robust estimator of the stability and a method to quantify the uncertainty of the estimates through approximate confidence intervals. Unfortunately, this statistical framework is based on asymptotic assumptions. In situations in which a low number of subsets of selected features are available for the quantification of the stability estimator, the coverage guarantees provided by this framework do not hold. In this work, we propose a method to estimate stability and achieve validity in a situation where only a few samples are available. We take advantage of the Conformal Prediction framework, constructing prediction intervals without any assumption about the underlying distribution of data. Extensive simulations show that our method successfully achieves conservative validity. Furthermore, as the number of available samples increases efficiency is also achieved. Comparisons between prediction intervals and confidence intervals show an acceptable trade-off between coverage guarantees and the interval length for the former, while there is a clear miscoverage for the latter.}
}
Endnote
%0 Conference Paper
%T Conformal Stability Measure of Feature Selection Algorithms
%A Marcos López-De-Castro
%A Alberto García-Galindo
%A Rubén Armañanzas
%B Proceedings of the Thirteenth Symposium on Conformal and Probabilistic Prediction with Applications
%C Proceedings of Machine Learning Research
%D 2024
%E Simone Vantini
%E Matteo Fontana
%E Aldo Solari
%E Henrik Boström
%E Lars Carlsson
%F pmlr-v230-lopez-de-castro24a
%I PMLR
%P 105--119
%U https://proceedings.mlr.press/v230/lopez-de-castro24a.html
%V 230
%X Quantifying the stability of feature selection techniques has been an ongoing challenge over the last two decades. A large number of stability estimators have been proposed to overcome this problem, but performance guarantees based on suitable statistical frameworks are lacking. A recently developed framework proposed a new and robust estimator of the stability and a method to quantify the uncertainty of the estimates through approximate confidence intervals. Unfortunately, this statistical framework is based on asymptotic assumptions. In situations in which a low number of subsets of selected features are available for the quantification of the stability estimator, the coverage guarantees provided by this framework do not hold. In this work, we propose a method to estimate stability and achieve validity in a situation where only a few samples are available. We take advantage of the Conformal Prediction framework, constructing prediction intervals without any assumption about the underlying distribution of data. Extensive simulations show that our method successfully achieves conservative validity. Furthermore, as the number of available samples increases efficiency is also achieved. Comparisons between prediction intervals and confidence intervals show an acceptable trade-off between coverage guarantees and the interval length for the former, while there is a clear miscoverage for the latter.
APA
López-De-Castro, M., García-Galindo, A. & Armañanzas, R. (2024). Conformal Stability Measure of Feature Selection Algorithms. Proceedings of the Thirteenth Symposium on Conformal and Probabilistic Prediction with Applications, in Proceedings of Machine Learning Research 230:105-119. Available from https://proceedings.mlr.press/v230/lopez-de-castro24a.html.
