Faster Coordinate Descent via Adaptive Importance Sampling

Dmytro Perekrestenko, Volkan Cevher, Martin Jaggi
Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, PMLR 54:869-877, 2017.

Abstract

Coordinate descent methods employ random partial updates of decision variables in order to solve huge-scale convex optimization problems. In this work, we introduce new adaptive rules for the random selection of their updates. By adaptive, we mean that our selection rules are based on the dual residual or the primal-dual gap estimates and can change at each iteration. We theoretically characterize the performance of our selection rules and demonstrate improvements over the state-of-the-art, and extend our theory and algorithms to general convex objectives. Numerical evidence with hinge-loss support vector machines and Lasso confirms that the practice follows the theory.
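As a concrete illustration of adaptive importance sampling for coordinate descent, below is a minimal Python sketch, not the authors' implementation: it minimizes a least-squares objective and draws coordinates with probability proportional to their partial-derivative magnitudes, a simple stand-in for the dual-residual and duality-gap scores proposed in the paper. The function name and all parameters are illustrative assumptions.

    import numpy as np

    def adaptive_coordinate_descent(A, b, iters=1000, seed=None):
        # Coordinate descent on f(x) = 0.5 * ||Ax - b||^2.
        # Coordinates are sampled with probability proportional to |grad_i f(x)|,
        # a stand-in for the paper's dual-residual / gap-based selection scores.
        rng = np.random.default_rng(seed)
        n = A.shape[1]
        col_norms_sq = (A ** 2).sum(axis=0)   # per-coordinate curvature constants
        x = np.zeros(n)
        residual = A @ x - b                  # maintained incrementally below
        for _ in range(iters):
            grad = A.T @ residual             # score vector (full gradient)
            scores = np.abs(grad)
            total = scores.sum()
            if total == 0.0:                  # exact stationary point reached
                break
            i = rng.choice(n, p=scores / total)
            step = grad[i] / col_norms_sq[i]  # exact minimization along coordinate i
            x[i] -= step
            residual -= step * A[:, i]        # O(m) residual update
        return x

Note that this sketch recomputes the full score vector at every iteration, which costs a full gradient evaluation; part of the paper's contribution is that its adaptive rules admit gap estimates that can be maintained cheaply, a refinement omitted here for clarity.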

Cite this Paper


BibTeX
@InProceedings{pmlr-v54-perekrestenko17a,
  title = {{Faster Coordinate Descent via Adaptive Importance Sampling}},
  author = {Perekrestenko, Dmytro and Cevher, Volkan and Jaggi, Martin},
  booktitle = {Proceedings of the 20th International Conference on Artificial Intelligence and Statistics},
  pages = {869--877},
  year = {2017},
  editor = {Singh, Aarti and Zhu, Jerry},
  volume = {54},
  series = {Proceedings of Machine Learning Research},
  month = {20--22 Apr},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v54/perekrestenko17a/perekrestenko17a.pdf},
  url = {https://proceedings.mlr.press/v54/perekrestenko17a.html},
  abstract = {Coordinate descent methods employ random partial updates of decision variables in order to solve huge-scale convex optimization problems. In this work, we introduce new adaptive rules for the random selection of their updates. By adaptive, we mean that our selection rules are based on the dual residual or the primal-dual gap estimates and can change at each iteration. We theoretically characterize the performance of our selection rules and demonstrate improvements over the state-of-the-art, and extend our theory and algorithms to general convex objectives. Numerical evidence with hinge-loss support vector machines and Lasso confirms that the practice follows the theory.}
}
Endnote
%0 Conference Paper
%T Faster Coordinate Descent via Adaptive Importance Sampling
%A Dmytro Perekrestenko
%A Volkan Cevher
%A Martin Jaggi
%B Proceedings of the 20th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2017
%E Aarti Singh
%E Jerry Zhu
%F pmlr-v54-perekrestenko17a
%I PMLR
%P 869--877
%U https://proceedings.mlr.press/v54/perekrestenko17a.html
%V 54
%X Coordinate descent methods employ random partial updates of decision variables in order to solve huge-scale convex optimization problems. In this work, we introduce new adaptive rules for the random selection of their updates. By adaptive, we mean that our selection rules are based on the dual residual or the primal-dual gap estimates and can change at each iteration. We theoretically characterize the performance of our selection rules and demonstrate improvements over the state-of-the-art, and extend our theory and algorithms to general convex objectives. Numerical evidence with hinge-loss support vector machines and Lasso confirms that the practice follows the theory.
APA
Perekrestenko, D., Cevher, V. & Jaggi, M. (2017). Faster Coordinate Descent via Adaptive Importance Sampling. Proceedings of the 20th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 54:869-877. Available from https://proceedings.mlr.press/v54/perekrestenko17a.html.