Frank-Wolfe with Subsampling Oracle


Thomas Kerdreux, Fabian Pedregosa, Alexandre d’Aspremont;
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:2596-2605, 2018.


We analyze two novel randomized variants of the Frank-Wolfe (FW) or conditional gradient algorithm. While classical FW algorithms require solving a linear minimization problem over the entire domain at each iteration, the proposed methods only require solving a linear minimization problem over a small subset of the original domain. The first algorithm we propose is a randomized variant of the original FW algorithm and achieves the same $\mathcal{O}(1/t)$ sublinear convergence rate as its deterministic counterpart. The second algorithm is a randomized variant of the Away-step FW algorithm and, like its deterministic counterpart, reaches a linear (i.e., exponential) convergence rate, making it the first provably convergent randomized variant of Away-step FW. In both cases, while subsampling slows convergence by a constant factor, the linear minimization step can cost a fraction of that of the deterministic versions, especially when the data is streamed. We illustrate the computational gains of both algorithms on regression problems involving both $\ell_1$ and latent group lasso penalties.
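To make the subsampling oracle concrete, the following is a minimal sketch (not the authors' reference implementation) of the first randomized variant on a least-squares problem over the $\ell_1$ ball, whose vertices are the signed scaled coordinate vectors, so the linear minimization oracle reduces to picking one coordinate. The function name `subsampled_fw_l1`, the subset size `eta`, the radius `alpha`, and the quadratic objective are all illustrative assumptions.

```python
# Sketch of Frank-Wolfe with a subsampled linear minimization oracle
# on min_x 0.5 * ||Ax - b||^2 subject to ||x||_1 <= alpha.
# The l1 ball's vertices are {+/- alpha * e_i}, so the (sub)oracle only
# needs to inspect the gradient on a random subset of coordinates.
import numpy as np

def subsampled_fw_l1(A, b, alpha=1.0, eta=20, n_iters=500, seed=0):
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    for t in range(n_iters):
        grad = A.T @ (A @ x - b)                  # gradient of the quadratic loss
        idx = rng.choice(n, size=min(eta, n), replace=False)  # sampled subset of the domain
        j = idx[np.argmax(np.abs(grad[idx]))]     # best vertex within the subset
        s = np.zeros(n)
        s[j] = -alpha * np.sign(grad[j])          # subsampled linear minimization oracle
        gamma = 2.0 / (t + 2.0)                   # standard FW step size
        x = (1 - gamma) * x + gamma * s
    return x
```

Restricting the argmax to the sampled index set is what replaces the full linear minimization oracle; setting `eta = n` recovers the deterministic FW iteration.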
