Alternating Randomized Block Coordinate Descent

Jelena Diakonikolas, Lorenzo Orecchia
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1224-1232, 2018.

Abstract

Block-coordinate descent algorithms and alternating minimization methods are fundamental optimization algorithms and an important primitive in large-scale optimization and machine learning. While various block-coordinate-descent-type methods have been studied extensively, only alternating minimization – which applies to the setting of only two blocks – is known to have convergence time that scales independently of the least smooth block. A natural question is then: is the setting of two blocks special? We show that the answer is “no” as long as the least smooth block can be optimized exactly – an assumption that is also needed in the setting of alternating minimization. We do so by introducing a novel algorithm AR-BCD, whose convergence time scales independently of the least smooth (possibly non-smooth) block. The basic algorithm generalizes both alternating minimization and randomized block coordinate (gradient) descent, and we also provide its accelerated version – AAR-BCD.
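The core idea described above — exactly minimizing over the least smooth block while taking randomized gradient steps on the remaining smooth blocks — can be sketched on a toy least-squares objective. This is an illustrative sketch of that scheme under simple assumptions (quadratic objective, 1/L block step sizes, uniform block sampling), not the paper's exact AR-BCD pseudocode:

```python
import numpy as np

# Toy objective: f(x) = 0.5 * ||A x - b||^2 with x split into blocks.
# The last block plays the role of the "least smooth" block and is
# minimized exactly (closed form here); the other blocks get randomized
# block gradient steps.
rng = np.random.default_rng(0)
block_sizes = [3, 3, 4]
n = sum(block_sizes)
A = rng.standard_normal((20, n))
b = rng.standard_normal(20)

starts = np.cumsum([0] + block_sizes)
blocks = [slice(starts[i], starts[i + 1]) for i in range(len(block_sizes))]

def grad(x):
    return A.T @ (A @ x - b)

# Per-block smoothness constants (spectral norm squared of each column block).
L = [np.linalg.norm(A[:, s], 2) ** 2 for s in blocks]

x = np.zeros(n)
for _ in range(500):
    # Exact minimization over the last block: solve the least-squares
    # subproblem in that block with all other blocks fixed.
    s = blocks[-1]
    As = A[:, s]
    r = b - A @ x + As @ x[s]            # residual excluding block s
    x[s] = np.linalg.lstsq(As, r, rcond=None)[0]
    # Randomized gradient step on one of the smooth blocks.
    i = rng.integers(len(blocks) - 1)
    si = blocks[i]
    x[si] -= grad(x)[si] / L[i]

x_star = np.linalg.lstsq(A, b, rcond=None)[0]
f_star = 0.5 * np.linalg.norm(A @ x_star - b) ** 2
f_x = 0.5 * np.linalg.norm(A @ x - b) ** 2
gap = f_x - f_star
print(gap)  # optimality gap after 500 iterations
```

On this strongly convex quadratic the iterates converge to the least-squares optimum; note that the step size for each smooth block depends only on that block's smoothness constant, never on the exactly-minimized block, which is the point the abstract makes.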

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-diakonikolas18a,
  title     = {Alternating Randomized Block Coordinate Descent},
  author    = {Diakonikolas, Jelena and Orecchia, Lorenzo},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {1224--1232},
  year      = {2018},
  editor    = {Jennifer Dy and Andreas Krause},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  address   = {Stockholmsmässan, Stockholm Sweden},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/diakonikolas18a/diakonikolas18a.pdf},
  url       = {http://proceedings.mlr.press/v80/diakonikolas18a.html},
  abstract  = {Block-coordinate descent algorithms and alternating minimization methods are fundamental optimization algorithms and an important primitive in large-scale optimization and machine learning. While various block-coordinate-descent-type methods have been studied extensively, only alternating minimization – which applies to the setting of only two blocks – is known to have convergence time that scales independently of the least smooth block. A natural question is then: is the setting of two blocks special? We show that the answer is “no” as long as the least smooth block can be optimized exactly – an assumption that is also needed in the setting of alternating minimization. We do so by introducing a novel algorithm AR-BCD, whose convergence time scales independently of the least smooth (possibly non-smooth) block. The basic algorithm generalizes both alternating minimization and randomized block coordinate (gradient) descent, and we also provide its accelerated version – AAR-BCD.}
}
Endnote
%0 Conference Paper
%T Alternating Randomized Block Coordinate Descent
%A Jelena Diakonikolas
%A Lorenzo Orecchia
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-diakonikolas18a
%I PMLR
%J Proceedings of Machine Learning Research
%P 1224--1232
%U http://proceedings.mlr.press
%V 80
%W PMLR
%X Block-coordinate descent algorithms and alternating minimization methods are fundamental optimization algorithms and an important primitive in large-scale optimization and machine learning. While various block-coordinate-descent-type methods have been studied extensively, only alternating minimization – which applies to the setting of only two blocks – is known to have convergence time that scales independently of the least smooth block. A natural question is then: is the setting of two blocks special? We show that the answer is “no” as long as the least smooth block can be optimized exactly – an assumption that is also needed in the setting of alternating minimization. We do so by introducing a novel algorithm AR-BCD, whose convergence time scales independently of the least smooth (possibly non-smooth) block. The basic algorithm generalizes both alternating minimization and randomized block coordinate (gradient) descent, and we also provide its accelerated version – AAR-BCD.
APA
Diakonikolas, J. & Orecchia, L. (2018). Alternating Randomized Block Coordinate Descent. Proceedings of the 35th International Conference on Machine Learning, in PMLR 80:1224-1232.