Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences

Benjamin Recht; Christopher Re

Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences

Benjamin Recht, Christopher Re

Proceedings of the 25th Annual Conference on Learning Theory, PMLR 23:11.1-11.24, 2012.

Abstract

Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with- and without-replacement in such algorithms. Focusing on least means squares optimization, we formulate a noncommutative arithmetic-geometric mean inequality that would prove that the expected convergence rate of without-replacement sampling is faster than that of with-replacement sampling. We demonstrate that this inequality holds for many classes of random matrices and for some pathological examples as well. We provide a deterministic worst-case bound on the gap between the discrepancy between the two sampling models, and explore some of the impediments to proving this inequality in full generality. We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.

Cite this Paper

BibTeX


@InProceedings{pmlr-v23-recht12,
  title = 	 {Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences},
  author = 	 {Recht, Benjamin and Re, Christopher},
  booktitle = 	 {Proceedings of the 25th Annual Conference on Learning Theory},
  pages = 	 {11.1--11.24},
  year = 	 {2012},
  editor = 	 {Mannor, Shie and Srebro, Nathan and Williamson, Robert C.},
  volume = 	 {23},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Edinburgh, Scotland},
  month = 	 {25--27 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v23/recht12/recht12.pdf},
  url = 	 {https://proceedings.mlr.press/v23/recht12.html},
  abstract = 	 {Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with- and without-replacement in such algorithms. Focusing on least means squares optimization, we formulate a noncommutative arithmetic-geometric mean inequality that would prove that the expected convergence rate of without-replacement sampling is faster than that of with-replacement sampling. We demonstrate that this inequality holds for many classes of random matrices and for some pathological examples as well. We provide a deterministic worst-case bound on the gap between the discrepancy between the two sampling models, and explore some of the impediments to proving this inequality in full generality. We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.}
}

Endnote

%0 Conference Paper
%T Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences
%A Benjamin Recht
%A Christopher Re
%B Proceedings of the 25th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2012
%E Shie Mannor
%E Nathan Srebro
%E Robert C. Williamson	
%F pmlr-v23-recht12
%I PMLR
%P 11.1--11.24
%U https://proceedings.mlr.press/v23/recht12.html
%V 23
%X Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with- and without-replacement in such algorithms. Focusing on least means squares optimization, we formulate a noncommutative arithmetic-geometric mean inequality that would prove that the expected convergence rate of without-replacement sampling is faster than that of with-replacement sampling. We demonstrate that this inequality holds for many classes of random matrices and for some pathological examples as well. We provide a deterministic worst-case bound on the gap between the discrepancy between the two sampling models, and explore some of the impediments to proving this inequality in full generality. We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.

RIS


TY  - CPAPER
TI  - Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences
AU  - Benjamin Recht
AU  - Christopher Re
BT  - Proceedings of the 25th Annual Conference on Learning Theory
DA  - 2012/06/16
ED  - Shie Mannor
ED  - Nathan Srebro
ED  - Robert C. Williamson	
ID  - pmlr-v23-recht12
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 23
SP  - 11.1
EP  - 11.24
L1  - http://proceedings.mlr.press/v23/recht12/recht12.pdf
UR  - https://proceedings.mlr.press/v23/recht12.html
AB  - Randomized algorithms that base iteration-level decisions on samples from some pool are ubiquitous in machine learning and optimization. Examples include stochastic gradient descent and randomized coordinate descent. This paper makes progress at theoretically evaluating the difference in performance between sampling with- and without-replacement in such algorithms. Focusing on least means squares optimization, we formulate a noncommutative arithmetic-geometric mean inequality that would prove that the expected convergence rate of without-replacement sampling is faster than that of with-replacement sampling. We demonstrate that this inequality holds for many classes of random matrices and for some pathological examples as well. We provide a deterministic worst-case bound on the gap between the discrepancy between the two sampling models, and explore some of the impediments to proving this inequality in full generality. We detail the consequences of this inequality for stochastic gradient descent and the randomized Kaczmarz algorithm for solving linear systems.
ER  -

APA


Recht, B. & Re, C.. (2012). Toward a Noncommutative Arithmetic-geometric Mean Inequality: Conjectures, Case-studies, and Consequences. Proceedings of the 25th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 23:11.1-11.24 Available from https://proceedings.mlr.press/v23/recht12.html.

Related Material

Download PDF