Kernel Extraction via Voted Risk Minimization
Proceedings of the 1st International Workshop on Feature Extraction: Modern Questions and Challenges at NIPS 2015, PMLR 44:72-89, 2015.
This paper studies a new framework for learning a predictor in the presence of multiple kernel functions where the learner selects or extracts several kernel functions from potentially complex families and finds an accurate predictor defined in terms of these functions. We present an algorithm, Voted Kernel Regularization, that provides the flexibility of using very complex kernel functions such as predictors based on high-degree polynomial kernels or narrow Gaussian kernels, while benefitting from strong learning guarantees. We show that our algorithm benefits from strong learning guarantees suggesting a new regularization penalty depending on the Rademacher complexities of the families of kernel functions used. Our algorithm admits several other favorable properties: its optimization problem is convex, it allows for learning with non-PDS kernels, and the solutions are highly sparse, resulting in improved classification speed and memory requirements. We report the results of some preliminary experiments comparing the performance of our algorithm to several baselines.