Feature screening with kernel knockoffs
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:1935-1974, 2022.
This article analyses three feature screening procedures: Kendall’s Tau and Spearman Rho (TR), Hilbert-Schmidt Independence Criterion (HSIC) and conditional Maximum Mean Discrepancy (cMMD), where the latter is a modified version of the standard MMD for categorical classification. These association measures are not based on any specific underlying model, such as the linear regression. We provide the conditions for which the sure independence screening (SIS) property is satisfied under a lower bound assumption on the minimum signal strength of the association measure. The SIS property for the HSIC and cMMD is established for given bounded and symmetric kernels. Within the high-dimensional setting, we propose a two-step approach to control the false discovery rate (FDR) using the knockoff filtering. The performances of the association measures are assessed through simulated and real data experiments and compared with existing competing screening methods.