Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond

Jaeyoung Cha, Jaewook Lee, Chulhee Yun
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:3855-3912, 2023.

Abstract

We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components n and the number of epochs K, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the condition number κ. For SGD with Random Reshuffling, we present lower bounds that have tighter κ dependencies than existing bounds. Our results are the first to perfectly close the gap between lower and upper bounds for weighted average iterates in both strongly-convex and convex cases. We also prove weighted average iterate lower bounds for arbitrary permutation-based SGD, which apply to all variants that carefully choose the best permutation. Our bounds improve the existing bounds in factors of n and κ and thereby match the upper bounds shown for a recently proposed algorithm called GraB.
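To make the setting concrete, the short Python sketch below runs without-replacement SGD with Random Reshuffling (a fresh random permutation of the n components at every epoch) on a finite-sum objective and returns both the final iterate and a weighted average iterate. It is a minimal illustration only: the function name, the uniform averaging weights, and the quadratic example in the usage snippet are assumptions made here for demonstration, not the step-size schedules or weight sequences analyzed in the paper.

import numpy as np

def random_reshuffling_sgd(component_grads, x0, num_epochs, lr):
    # component_grads: list of n callables, each returning the gradient of one f_i
    # x0: initial iterate; num_epochs: K passes over the data; lr: constant step size
    n = len(component_grads)
    x = np.array(x0, dtype=float)
    iterates = [x.copy()]
    for _ in range(num_epochs):
        perm = np.random.permutation(n)        # Random Reshuffling: new permutation each epoch
        for i in perm:                         # one without-replacement pass over all n components
            x = x - lr * component_grads[i](x)
            iterates.append(x.copy())
    # A weighted average iterate is any convex combination of the iterates;
    # uniform weights are used here purely for illustration.
    weights = np.ones(len(iterates)) / len(iterates)
    x_avg = sum(w * it for w, it in zip(weights, iterates))
    return x, x_avg

# Hypothetical usage on a 1-D strongly convex finite sum F(x) = (1/n) * sum_i 0.5 * a_i * (x - b_i)^2
rng = np.random.default_rng(0)
a = rng.uniform(1.0, 5.0, size=8)
b = rng.normal(size=8)
grads = [lambda x, ai=ai, bi=bi: ai * (x - bi) for ai, bi in zip(a, b)]
x_last, x_avg = random_reshuffling_sgd(grads, x0=[0.0], num_epochs=20, lr=0.02)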

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-cha23a,
  title     = {Tighter Lower Bounds for Shuffling {SGD}: Random Permutations and Beyond},
  author    = {Cha, Jaeyoung and Lee, Jaewook and Yun, Chulhee},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {3855--3912},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/cha23a/cha23a.pdf},
  url       = {https://proceedings.mlr.press/v202/cha23a.html},
  abstract  = {We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components $n$ and the number of epochs $K$, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the condition number $\kappa$. For SGD with Random Reshuffling, we present lower bounds that have tighter $\kappa$ dependencies than existing bounds. Our results are the first to perfectly close the gap between lower and upper bounds for weighted average iterates in both strongly-convex and convex cases. We also prove weighted average iterate lower bounds for arbitrary permutation-based SGD, which apply to all variants that carefully choose the best permutation. Our bounds improve the existing bounds in factors of $n$ and $\kappa$ and thereby match the upper bounds shown for a recently proposed algorithm called GraB.}
}
Endnote
%0 Conference Paper
%T Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond
%A Jaeyoung Cha
%A Jaewook Lee
%A Chulhee Yun
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-cha23a
%I PMLR
%P 3855--3912
%U https://proceedings.mlr.press/v202/cha23a.html
%V 202
%X We study convergence lower bounds of without-replacement stochastic gradient descent (SGD) for solving smooth (strongly-)convex finite-sum minimization problems. Unlike most existing results focusing on final iterate lower bounds in terms of the number of components $n$ and the number of epochs $K$, we seek bounds for arbitrary weighted average iterates that are tight in all factors including the condition number $\kappa$. For SGD with Random Reshuffling, we present lower bounds that have tighter $\kappa$ dependencies than existing bounds. Our results are the first to perfectly close the gap between lower and upper bounds for weighted average iterates in both strongly-convex and convex cases. We also prove weighted average iterate lower bounds for arbitrary permutation-based SGD, which apply to all variants that carefully choose the best permutation. Our bounds improve the existing bounds in factors of $n$ and $\kappa$ and thereby match the upper bounds shown for a recently proposed algorithm called GraB.
APA
Cha, J., Lee, J. & Yun, C. (2023). Tighter Lower Bounds for Shuffling SGD: Random Permutations and Beyond. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:3855-3912. Available from https://proceedings.mlr.press/v202/cha23a.html.