Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1275-1283, 2019.
Conditional Gradients (aka Frank-Wolfe algorithms) form a classical family of methods for constrained smooth convex minimization, valued for their simplicity, the absence of projection steps, and competitive numerical performance. While the vanilla Frank-Wolfe algorithm only ensures a worst-case complexity of $O(1/\epsilon)$ iterations, various recent results have shown that for strongly convex functions the method can be slightly modified to achieve linear convergence. However, this still leaves a huge gap between sublinear $O(1/\epsilon)$ convergence and linear $O(\log 1/\epsilon)$ convergence to reach an $\epsilon$-approximate solution. Here, we present a new variant of Conditional Gradients that can dynamically adapt to the function's geometric properties using restarts, and thus smoothly interpolates between the sublinear and linear regimes. Furthermore, our results apply to generic compact convex constraint sets.
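To make the projection-free structure referred to above concrete, here is a minimal sketch of the vanilla Frank-Wolfe algorithm over the probability simplex, with the classical $\gamma_t = 2/(t+2)$ step size. This is the standard baseline method, not the restart variant proposed in the paper; the function names and the quadratic example objective are illustrative choices.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, n_iters=1000):
    """Vanilla Frank-Wolfe on the probability simplex.

    The linear minimization oracle over the simplex just returns the
    vertex e_i with the smallest gradient coordinate, so the iterates
    stay feasible without any projection step.
    """
    x = x0.copy()
    for t in range(n_iters):
        g = grad(x)
        i = np.argmin(g)                 # LMO: best simplex vertex
        s = np.zeros_like(x)
        s[i] = 1.0
        gamma = 2.0 / (t + 2.0)          # classical step size, O(1/t) rate
        x = (1.0 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Illustrative use: minimize ||x - b||^2 over the simplex
b = np.array([0.2, 0.5, 0.9])
x = frank_wolfe_simplex(lambda x: 2.0 * (x - b), np.ones(3) / 3)
```

Each iteration costs one gradient evaluation and one linear minimization over the feasible set, which is what makes the method attractive on sets where projection is expensive; the $O(1/t)$ suboptimality decay of this vanilla scheme corresponds to the $O(1/\epsilon)$ worst-case complexity discussed in the abstract.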