Second-Order Kernel Online Convex Optimization with Adaptive Sketching


Daniele Calandriello, Alessandro Lazaric, Michal Valko ;
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:645-653, 2017.


Kernel online convex optimization (KOCO) is a framework combining the expressiveness of non-parametric kernel models with the regret guarantees of online learning. First-order KOCO methods such as functional gradient descent require only $O(t)$ time and space per iteration, and, when the only information on the losses is their convexity, achieve a minimax optimal $O(\sqrt{T})$ regret. Nonetheless, many common losses in kernel problems, such as squared loss, logistic loss, and squared hinge loss posses stronger curvature that can be exploited. In this case, second-order KOCO methods achieve $O(\log(\mathrm{Det}(K)))$ regret, which we show scales as $O(deff \log T)$, where $deff$ is the effective dimension of the problem and is usually much smaller than $O(\sqrt{T})$. The main drawback of second-order methods is their much higher $O(t^2)$ space and time complexity. In this paper, we introduce kernel online Newton step (KONS), a new second-order KOCO method that also achieves $O(deff\log T)$ regret. To address the computational complexity of second-order methods, we introduce a new matrix sketching algorithm for the kernel matrix~$K$, and show that for a chosen parameter $\gamma \leq 1$ our Sketched-KONS reduces the space and time complexity by a factor of $\gamma^2$ to $O(t^2\gamma^2)$ space and time per iteration, while incurring only $1/\gamma$ times more regret.

Related Material