Restarting Frank-Wolfe

Thomas Kerdreux, Alexandre d’Aspremont, Sebastian Pokutta
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1275-1283, 2019.

Abstract

Conditional Gradients (aka Frank-Wolfe algorithms) form a classical set of methods for constrained smooth convex minimization due to their simplicity, the absence of a projection step, and competitive numerical performance. While the vanilla Frank-Wolfe algorithm only ensures a worst-case rate of $O(1/\epsilon)$, various recent results have shown that for strongly convex functions, the method can be slightly modified to achieve linear convergence. However, this still leaves a huge gap between sublinear $O(1/\epsilon)$ convergence and linear $O(\log 1/\epsilon)$ convergence to reach an $\epsilon$-approximate solution. Here, we present a new variant of Conditional Gradients that can dynamically adapt to the function’s geometric properties using restarts and thus smoothly interpolates between the sublinear and linear regimes. Furthermore, our results apply to generic compact convex constraint sets.
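For context, below is a minimal sketch of the vanilla Frank-Wolfe (Conditional Gradients) iteration that the abstract takes as its baseline: projection-free, relying only on a linear minimization oracle over the constraint set, with the classical $O(1/\epsilon)$ worst-case rate. This is not the restart variant proposed in the paper; the oracle for an $\ell_1$-ball and the least-squares objective used here are illustrative assumptions.

```python
import numpy as np

def frank_wolfe(grad, lmo, x0, max_iters=1000):
    """Vanilla Frank-Wolfe: projection-free constrained minimization.

    grad: callable returning the gradient of the smooth objective at x.
    lmo:  linear minimization oracle, returns argmin_{v in C} <g, v>.
    x0:   feasible starting point in the compact convex set C.
    """
    x = x0.copy()
    for t in range(max_iters):
        g = grad(x)
        v = lmo(g)                          # linear minimization oracle call
        gamma = 2.0 / (t + 2.0)             # standard open-loop step size
        x = (1.0 - gamma) * x + gamma * v   # convex combination stays feasible
    return x

# Illustrative oracle (assumed example): for the l1-ball of radius r,
# argmin_{||v||_1 <= r} <g, v> is a signed vertex -r * sign(g_i) * e_i
# at the coordinate i with largest |g_i|.
def l1_ball_lmo(g, r=1.0):
    i = np.argmax(np.abs(g))
    v = np.zeros_like(g)
    v[i] = -r * np.sign(g[i])
    return v

# Toy usage: least squares over the l1-ball.
A = np.random.randn(20, 10)
b = np.random.randn(20)
x_hat = frank_wolfe(lambda x: A.T @ (A @ x - b), l1_ball_lmo, np.zeros(10))
```

The paper's contribution, per the abstract, is a restart scheme on top of such conditional-gradient methods that adapts to the function's geometry and interpolates between the sublinear and linear convergence regimes.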

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-kerdreux19a, title = {Restarting Frank-Wolfe}, author = {Kerdreux, Thomas and d'Aspremont, Alexandre and Pokutta, Sebastian}, booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics}, pages = {1275--1283}, year = {2019}, editor = {Chaudhuri, Kamalika and Sugiyama, Masashi}, volume = {89}, series = {Proceedings of Machine Learning Research}, month = {16--18 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v89/kerdreux19a/kerdreux19a.pdf}, url = {https://proceedings.mlr.press/v89/kerdreux19a.html}, abstract = {Conditional Gradients (aka Frank-Wolfe algorithms) form a classical set of methods for constrained smooth convex minimization due to their simplicity, the absence of projection step, and competitive numerical performance. While the vanilla Frank-Wolfe algorithm only ensures a worst-case rate of $O(1/\epsilon)$, various recent results have shown that for strongly convex functions, the method can be slightly modified to achieve linear convergence. However, this still leaves a huge gap between sublinear $O(1/\epsilon)$ convergence and linear $O(\log 1/\epsilon)$ convergence to reach an $\epsilon$-approximate solution. Here, we present a new variant of Conditional Gradients, that can dynamically adapt to the function’s geometric properties using restarts and thus smoothly interpolates between the sublinear and linear regimes. Furthermore, our results apply to generic compact convex constraint sets.} }
Endnote
%0 Conference Paper %T Restarting Frank-Wolfe %A Thomas Kerdreux %A Alexandre d’Aspremont %A Sebastian Pokutta %B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Masashi Sugiyama %F pmlr-v89-kerdreux19a %I PMLR %P 1275--1283 %U https://proceedings.mlr.press/v89/kerdreux19a.html %V 89 %X Conditional Gradients (aka Frank-Wolfe algorithms) form a classical set of methods for constrained smooth convex minimization due to their simplicity, the absence of projection step, and competitive numerical performance. While the vanilla Frank-Wolfe algorithm only ensures a worst-case rate of $O(1/\epsilon)$, various recent results have shown that for strongly convex functions, the method can be slightly modified to achieve linear convergence. However, this still leaves a huge gap between sublinear $O(1/\epsilon)$ convergence and linear $O(\log 1/\epsilon)$ convergence to reach an $\epsilon$-approximate solution. Here, we present a new variant of Conditional Gradients, that can dynamically adapt to the function’s geometric properties using restarts and thus smoothly interpolates between the sublinear and linear regimes. Furthermore, our results apply to generic compact convex constraint sets.
APA
Kerdreux, T., d’Aspremont, A. & Pokutta, S.. (2019). Restarting Frank-Wolfe. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:1275-1283 Available from https://proceedings.mlr.press/v89/kerdreux19a.html.
