Towards Provable Learning of Polynomial Neural Networks Using Low-Rank Matrix Estimation
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:1417-1426, 2018.
Abstract
We study the problem of (provably) learning the weights of a two-layer neural network with quadratic activations. In particular, we focus on the under-parametrized regime where the number of neurons in the hidden layer is (much) smaller than the dimension of the input. Our approach uses a lifting trick, which enables us to borrow algorithmic ideas from low-rank matrix estimation. In this context, we propose two novel, non-convex training algorithms which do not need any extra tuning parameters other than the number of hidden neurons. We support our algorithms with rigorous theoretical analysis, and show that the proposed algorithms enjoy linear convergence, fast running time per iteration, and near-optimal sample complexity. Finally, we complement our theoretical results with several numerical experiments.
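To illustrate the lifting trick mentioned above, here is a minimal sketch (not the paper's code; the unit output weights and variable names are assumptions for illustration): a two-layer network with quadratic activations, y(x) = Σ_i (wᵢᵀx)², can be rewritten as a linear measurement of the rank-r matrix M = WWᵀ, namely y(x) = xᵀMx = ⟨xxᵀ, M⟩, which connects the learning problem to low-rank matrix estimation.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 10, 3  # input dimension d, hidden neurons r (under-parametrized: r << d)

W = rng.standard_normal((d, r))  # hidden-layer weights, one column per neuron
x = rng.standard_normal(d)       # a single input

y_network = np.sum((W.T @ x) ** 2)  # forward pass: sum of squared pre-activations
M = W @ W.T                         # the lifted rank-r PSD matrix
y_lifted = x @ M @ x                # linear in M: equals trace(M @ np.outer(x, x))

assert np.isclose(y_network, y_lifted)
print(np.linalg.matrix_rank(M))  # rank is r = 3, the low-rank structure exploited
```

The network output is thus linear in M, so recovering M (and hence W up to rotation) becomes a low-rank matrix estimation problem from quadratic measurements.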