Breaking Locality Accelerates Block Gauss-Seidel

Stephen Tu; Shivaram Venkataraman; Ashia C. Wilson; Alex Gittens; Michael I. Jordan; Benjamin Recht

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3482-3491, 2017.

Abstract

Recent work by Nesterov and Stich (2016) showed that momentum can be used to accelerate the rate of convergence for block Gauss-Seidel in the setting where a fixed partitioning of the coordinates is chosen ahead of time. We show that this setting is too restrictive, constructing instances where breaking locality by running non-accelerated Gauss-Seidel with randomly sampled coordinates substantially outperforms accelerated Gauss-Seidel with any fixed partitioning. Motivated by this finding, we analyze the accelerated block Gauss-Seidel algorithm in the random coordinate sampling setting. Our analysis captures the benefit of acceleration with a new data-dependent parameter which is well behaved when the matrix sub-blocks are well-conditioned. Empirically, we show that accelerated Gauss-Seidel with random coordinate sampling provides speedups for large-scale machine learning tasks when compared to non-accelerated Gauss-Seidel and the classical conjugate-gradient algorithm.
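
To make the two sampling schemes in the abstract concrete, here is a minimal sketch of the non-accelerated block Gauss-Seidel update for a symmetric positive definite system Ax = b. It is an illustration under stated assumptions, not the authors' implementation: the function name `block_gauss_seidel`, its parameters, and the choice of contiguous blocks for the fixed partition are all hypothetical, and the momentum (accelerated) variant analyzed in the paper is not shown.

```python
import numpy as np


def block_gauss_seidel(A, b, block_size, iters, sampling="random", seed=0):
    """Non-accelerated block Gauss-Seidel for a symmetric positive definite A.

    sampling="fixed":  each step picks one block from a partition chosen
                       ahead of time (contiguous blocks, an assumption here).
    sampling="random": each step draws block_size coordinates uniformly
                       without replacement, i.e. "breaking locality".
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(n)

    # Fixed partition of the coordinates, built once before iterating.
    partition = [np.arange(start, min(start + block_size, n))
                 for start in range(0, n, block_size)]

    for _ in range(iters):
        if sampling == "fixed":
            idx = partition[rng.integers(len(partition))]
        else:
            idx = rng.choice(n, size=block_size, replace=False)

        # Residual restricted to the selected coordinates.
        residual = A[idx] @ x - b[idx]
        # Exact solve on the selected sub-block of A, then update in place.
        x[idx] -= np.linalg.solve(A[np.ix_(idx, idx)], residual)

    return x


if __name__ == "__main__":
    # Small synthetic, well-conditioned test problem (made up for illustration).
    rng = np.random.default_rng(1)
    M = rng.standard_normal((20, 20))
    A = M @ M.T + 20 * np.eye(20)
    b = rng.standard_normal(20)
    for mode in ("fixed", "random"):
        x = block_gauss_seidel(A, b, block_size=5, iters=2000, sampling=mode)
        print(mode, np.linalg.norm(A @ x - b))
```

The only difference between the two modes is how the index set is drawn at each step; the sub-block solve is identical. On a small well-conditioned problem like the one above both modes converge, so this sketch does not reproduce the separation between the schemes that the paper constructs.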

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-tu17a,
  title = 	 {Breaking Locality Accelerates Block {G}auss-{S}eidel},
  author =       {Stephen Tu and Shivaram Venkataraman and Ashia C. Wilson and Alex Gittens and Michael I. Jordan and Benjamin Recht},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {3482--3491},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/tu17a/tu17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/tu17a.html},
  abstract = 	 {Recent work by Nesterov and Stich (2016) showed that momentum can be used to accelerate the rate of convergence for block Gauss-Seidel in the setting where a fixed partitioning of the coordinates is chosen ahead of time. We show that this setting is too restrictive, constructing instances where breaking locality by running non-accelerated Gauss-Seidel with randomly sampled coordinates substantially outperforms accelerated Gauss-Seidel with any fixed partitioning. Motivated by this finding, we analyze the accelerated block Gauss-Seidel algorithm in the random coordinate sampling setting. Our analysis captures the benefit of acceleration with a new data-dependent parameter which is well behaved when the matrix sub-blocks are well-conditioned. Empirically, we show that accelerated Gauss-Seidel with random coordinate sampling provides speedups for large scale machine learning tasks when compared to non-accelerated Gauss-Seidel and the classical conjugate-gradient algorithm.}
}

Endnote

%0 Conference Paper
%T Breaking Locality Accelerates Block Gauss-Seidel
%A Stephen Tu
%A Shivaram Venkataraman
%A Ashia C. Wilson
%A Alex Gittens
%A Michael I. Jordan
%A Benjamin Recht
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-tu17a
%I PMLR
%P 3482--3491
%U https://proceedings.mlr.press/v70/tu17a.html
%V 70
%X Recent work by Nesterov and Stich (2016) showed that momentum can be used to accelerate the rate of convergence for block Gauss-Seidel in the setting where a fixed partitioning of the coordinates is chosen ahead of time. We show that this setting is too restrictive, constructing instances where breaking locality by running non-accelerated Gauss-Seidel with randomly sampled coordinates substantially outperforms accelerated Gauss-Seidel with any fixed partitioning. Motivated by this finding, we analyze the accelerated block Gauss-Seidel algorithm in the random coordinate sampling setting. Our analysis captures the benefit of acceleration with a new data-dependent parameter which is well behaved when the matrix sub-blocks are well-conditioned. Empirically, we show that accelerated Gauss-Seidel with random coordinate sampling provides speedups for large scale machine learning tasks when compared to non-accelerated Gauss-Seidel and the classical conjugate-gradient algorithm.

APA


Tu, S., Venkataraman, S., Wilson, A.C., Gittens, A., Jordan, M.I. & Recht, B. (2017). Breaking Locality Accelerates Block Gauss-Seidel. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3482-3491. Available from https://proceedings.mlr.press/v70/tu17a.html.
