Efficient Greedy Coordinate Descent for Composite Problems

Sai Praneeth Karimireddy, Anastasia Koloskova, Sebastian U. Stich, Martin Jaggi
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:2887-2896, 2019.

Abstract

Coordinate descent with random coordinate selection is the current state of the art for many large scale optimization problems. However, greedy selection of the steepest coordinate on smooth problems can yield convergence rates independent of the dimension $n$, requiring $n$ times fewer iterations. In this paper, we consider greedy updates that are based on subgradients for a class of non-smooth composite problems, including $L1$-regularized problems, SVMs and related applications. For these problems we provide (i) the first linear rates of convergence independent of $n$, and show that our greedy update rule provides speedups similar to those obtained in the smooth case. This was previously conjectured to be true for a stronger greedy coordinate selection strategy. Furthermore, we show that (ii) our new selection rule can be mapped to instances of maximum inner product search, allowing to leverage standard nearest neighbor algorithms to speed up the implementation. We demonstrate the validity of the approach through extensive numerical experiments.

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-karimireddy19a, title = {Efficient Greedy Coordinate Descent for Composite Problems}, author = {Karimireddy, Sai Praneeth and Koloskova, Anastasia and Stich, Sebastian U. and Jaggi, Martin}, booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics}, pages = {2887--2896}, year = {2019}, editor = {Chaudhuri, Kamalika and Sugiyama, Masashi}, volume = {89}, series = {Proceedings of Machine Learning Research}, month = {16--18 Apr}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v89/karimireddy19a/karimireddy19a.pdf}, url = {http://proceedings.mlr.press/v89/karimireddy19a.html}, abstract = {Coordinate descent with random coordinate selection is the current state of the art for many large scale optimization problems. However, greedy selection of the steepest coordinate on smooth problems can yield convergence rates independent of the dimension $n$, requiring $n$ times fewer iterations. In this paper, we consider greedy updates that are based on subgradients for a class of non-smooth composite problems, including $L1$-regularized problems, SVMs and related applications. For these problems we provide (i) the first linear rates of convergence independent of $n$, and show that our greedy update rule provides speedups similar to those obtained in the smooth case. This was previously conjectured to be true for a stronger greedy coordinate selection strategy. Furthermore, we show that (ii) our new selection rule can be mapped to instances of maximum inner product search, allowing to leverage standard nearest neighbor algorithms to speed up the implementation. We demonstrate the validity of the approach through extensive numerical experiments.} }
Endnote
%0 Conference Paper %T Efficient Greedy Coordinate Descent for Composite Problems %A Sai Praneeth Karimireddy %A Anastasia Koloskova %A Sebastian U. Stich %A Martin Jaggi %B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2019 %E Kamalika Chaudhuri %E Masashi Sugiyama %F pmlr-v89-karimireddy19a %I PMLR %P 2887--2896 %U http://proceedings.mlr.press/v89/karimireddy19a.html %V 89 %X Coordinate descent with random coordinate selection is the current state of the art for many large scale optimization problems. However, greedy selection of the steepest coordinate on smooth problems can yield convergence rates independent of the dimension $n$, requiring $n$ times fewer iterations. In this paper, we consider greedy updates that are based on subgradients for a class of non-smooth composite problems, including $L1$-regularized problems, SVMs and related applications. For these problems we provide (i) the first linear rates of convergence independent of $n$, and show that our greedy update rule provides speedups similar to those obtained in the smooth case. This was previously conjectured to be true for a stronger greedy coordinate selection strategy. Furthermore, we show that (ii) our new selection rule can be mapped to instances of maximum inner product search, allowing to leverage standard nearest neighbor algorithms to speed up the implementation. We demonstrate the validity of the approach through extensive numerical experiments.
APA
Karimireddy, S.P., Koloskova, A., Stich, S.U. & Jaggi, M.. (2019). Efficient Greedy Coordinate Descent for Composite Problems. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:2887-2896 Available from http://proceedings.mlr.press/v89/karimireddy19a.html.

Related Material