Improper Deep Kernels
Proceedings of the 19th International Conference on Artificial Intelligence and Statistics, PMLR 51:1159-1167, 2016.
Neural networks have recently re-emerged as a powerful hypothesis class, yielding impressive classification accuracy in multiple domains. However, their training is a non convex optimization problem. Here we address this difficulty by turning to "improper learning" of neural nets. In other words, we learn a classifier that is not a neural net but is competitive with the best neural net model given a sufficient number of training examples. Our approach relies on a novel kernel which integrates over the set of possible neural models. It turns out that the corresponding integral can be evaluated in closed form via a simple recursion. The learning problem is then an SVM with this kernel, and a global optimum can thus be found efficiently. We also provide sample complexity results which depend on the stability of the optimal neural net.