Breaking Locality Accelerates Block Gauss-Seidel

Stephen Tu, Shivaram Venkataraman, Ashia C. Wilson, Alex Gittens, Michael I. Jordan, Benjamin Recht
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3482-3491, 2017.

Abstract

Recent work by Nesterov and Stich (2016) showed that momentum can be used to accelerate the rate of convergence for block Gauss-Seidel in the setting where a fixed partitioning of the coordinates is chosen ahead of time. We show that this setting is too restrictive, constructing instances where breaking locality by running non-accelerated Gauss-Seidel with randomly sampled coordinates substantially outperforms accelerated Gauss-Seidel with any fixed partitioning. Motivated by this finding, we analyze the accelerated block Gauss-Seidel algorithm in the random coordinate sampling setting. Our analysis captures the benefit of acceleration with a new data-dependent parameter which is well behaved when the matrix sub-blocks are well-conditioned. Empirically, we show that accelerated Gauss-Seidel with random coordinate sampling provides speedups for large scale machine learning tasks when compared to non-accelerated Gauss-Seidel and the classical conjugate-gradient algorithm.
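For concreteness, below is a minimal NumPy sketch (not the authors' code) of the non-accelerated block Gauss-Seidel update in the random coordinate sampling setting: at each step a block S of coordinates is drawn uniformly at random and the quadratic objective f(x) = (1/2) x'Ax - b'x is minimized exactly over that block. The accelerated variant analyzed in the paper adds a Nesterov-style momentum sequence on top of the same block update; all function and variable names here are illustrative.

    import numpy as np

    def block_gauss_seidel(A, b, block_size, num_iters, rng=None):
        # Non-accelerated block Gauss-Seidel for Ax = b, with A symmetric
        # positive definite. Each iteration draws its coordinate block
        # uniformly at random ("breaking locality") rather than cycling
        # through a fixed partition chosen ahead of time.
        rng = np.random.default_rng() if rng is None else rng
        n = A.shape[0]
        x = np.zeros(n)
        for _ in range(num_iters):
            S = rng.choice(n, size=block_size, replace=False)
            # Exact minimization over the block: solve
            # A[S, S] @ delta = (b - A @ x)[S], then update x[S].
            residual_S = b[S] - A[S, :] @ x
            x[S] += np.linalg.solve(A[np.ix_(S, S)], residual_S)
        return x

    # Usage on a random positive-definite system (assumed test setup):
    rng = np.random.default_rng(0)
    M = rng.standard_normal((200, 200))
    A = M @ M.T + 0.1 * np.eye(200)   # symmetric positive definite
    b = rng.standard_normal(200)
    x = block_gauss_seidel(A, b, block_size=20, num_iters=3000, rng=rng)
    print(np.linalg.norm(A @ x - b))  # residual decreases with iterations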

Cite this Paper

BibTeX
@InProceedings{pmlr-v70-tu17a,
  title = {Breaking Locality Accelerates Block {G}auss-{S}eidel},
  author = {Stephen Tu and Shivaram Venkataraman and Ashia C. Wilson and Alex Gittens and Michael I. Jordan and Benjamin Recht},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {3482--3491},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v70/tu17a/tu17a.pdf},
  url = {https://proceedings.mlr.press/v70/tu17a.html},
  abstract = {Recent work by Nesterov and Stich (2016) showed that momentum can be used to accelerate the rate of convergence for block Gauss-Seidel in the setting where a fixed partitioning of the coordinates is chosen ahead of time. We show that this setting is too restrictive, constructing instances where breaking locality by running non-accelerated Gauss-Seidel with randomly sampled coordinates substantially outperforms accelerated Gauss-Seidel with any fixed partitioning. Motivated by this finding, we analyze the accelerated block Gauss-Seidel algorithm in the random coordinate sampling setting. Our analysis captures the benefit of acceleration with a new data-dependent parameter which is well behaved when the matrix sub-blocks are well-conditioned. Empirically, we show that accelerated Gauss-Seidel with random coordinate sampling provides speedups for large scale machine learning tasks when compared to non-accelerated Gauss-Seidel and the classical conjugate-gradient algorithm.}
}
EndNote
%0 Conference Paper
%T Breaking Locality Accelerates Block Gauss-Seidel
%A Stephen Tu
%A Shivaram Venkataraman
%A Ashia C. Wilson
%A Alex Gittens
%A Michael I. Jordan
%A Benjamin Recht
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-tu17a
%I PMLR
%P 3482--3491
%U https://proceedings.mlr.press/v70/tu17a.html
%V 70
%X Recent work by Nesterov and Stich (2016) showed that momentum can be used to accelerate the rate of convergence for block Gauss-Seidel in the setting where a fixed partitioning of the coordinates is chosen ahead of time. We show that this setting is too restrictive, constructing instances where breaking locality by running non-accelerated Gauss-Seidel with randomly sampled coordinates substantially outperforms accelerated Gauss-Seidel with any fixed partitioning. Motivated by this finding, we analyze the accelerated block Gauss-Seidel algorithm in the random coordinate sampling setting. Our analysis captures the benefit of acceleration with a new data-dependent parameter which is well behaved when the matrix sub-blocks are well-conditioned. Empirically, we show that accelerated Gauss-Seidel with random coordinate sampling provides speedups for large scale machine learning tasks when compared to non-accelerated Gauss-Seidel and the classical conjugate-gradient algorithm.
APA
Tu, S., Venkataraman, S., Wilson, A.C., Gittens, A., Jordan, M.I. & Recht, B. (2017). Breaking Locality Accelerates Block Gauss-Seidel. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3482-3491. Available from https://proceedings.mlr.press/v70/tu17a.html.