Conformal Stability Measure of Feature Selection Algorithms
Proceedings of the Thirteenth Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 230:105-119, 2024.
Abstract
Quantifying the stability of feature selection techniques has been an ongoing challenge over the last two decades. A large number of stability estimators have been proposed to address this problem, but performance guarantees based on suitable statistical frameworks are lacking. A recently developed framework proposed a new, robust estimator of stability together with a method to quantify the uncertainty of the estimates through approximate confidence intervals. Unfortunately, this statistical framework relies on asymptotic assumptions: when only a small number of subsets of selected features are available for computing the stability estimator, its coverage guarantees do not hold. In this work, we propose a method to estimate stability and achieve validity in situations where only a few samples are available. We take advantage of the Conformal Prediction framework, constructing prediction intervals without any assumptions about the underlying data distribution. Extensive simulations show that our method successfully achieves conservative validity. Furthermore, as the number of available samples increases, efficiency is also achieved. Comparisons between prediction intervals and confidence intervals show an acceptable trade-off between coverage guarantees and interval length for the former, while the latter exhibits clear miscoverage.
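
To illustrate the flavour of distribution-free prediction interval that the Conformal Prediction framework provides (this is a generic order-statistic sketch, not the paper's exact estimator), consider n exchangeable scalar stability estimates, for example from repeated runs of a feature selection algorithm on resampled data. For exchangeable continuous values, P(Z_{n+1} <= Z_(k)) = k / (n + 1), so order statistics yield finite-sample coverage with no distributional assumptions. The function name and the simulated estimates below are hypothetical.

import numpy as np

def order_statistic_prediction_interval(samples, alpha=0.1):
    # Distribution-free prediction interval for the next exchangeable sample.
    # Coverage P(lower <= Z_new <= upper) >= 1 - alpha holds for any
    # underlying distribution, even with very few samples (the interval
    # may then be uninformative, i.e., extend to +/- infinity).
    z = np.sort(np.asarray(samples, dtype=float))
    n = z.size
    lo_idx = int(np.floor((n + 1) * alpha / 2))       # lower order statistic
    hi_idx = int(np.ceil((n + 1) * (1 - alpha / 2)))  # upper order statistic
    lower = z[lo_idx - 1] if lo_idx >= 1 else -np.inf
    upper = z[hi_idx - 1] if hi_idx <= n else np.inf
    return lower, upper

# Hypothetical usage: 20 simulated stability estimates in [0.6, 0.8].
rng = np.random.default_rng(0)
estimates = rng.uniform(0.6, 0.8, size=20)
print(order_statistic_prediction_interval(estimates, alpha=0.1))

With n = 20 and alpha = 0.1, the sketch returns the interval between the smallest and largest observed estimates, whose guaranteed coverage is 1 - 2/(n + 1), roughly 0.905; this is the conservative validity behaviour the abstract refers to, with intervals that tighten (become more efficient) as more samples become available.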