Generalization Bounds for Supervised Dimensionality Reduction
Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, PMLR 44:226-241, 2015.
We introduce and study the learning scenario of \emphsupervised dimensionality reduction, which couples dimensionality reduction and a subsequent supervised learning step. We present new generalization bounds for this scenario based on a careful analysis of the empirical Rademacher complexity of the relevant hypothesis set. In particular, we show an upper bound on the Rademacher complexity that is in \widetilde O(\sqrt\Lambda_(r)/m), where m is the sample size and \Lambda_(r) the upper bound on the Ky-Fan r-norm of the operator that defines the dimensionality reduction projection. We give both upper and lower bound guarantees in terms of that Ky-Fan r-norm, which strongly justifies the definition of our hypothesis set. To the best of our knowledge, these are the first learning guarantees for the problem of supervised dimensionality reduction with a \emphlearned kernel-based mapping. Our analysis and learning guarantees further apply to several special cases, such as that of using a fixed kernel with supervised dimensionality reduction or that of unsupervised learning of a kernel for dimensionality reduction followed by a supervised learning algorithm.