Optimal Mini-Batch and Step Sizes for SAGA
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2142-2150, 2019.
Abstract
Recently it has been shown that the step sizes of a family of variance-reduced gradient methods, called the JacSketch methods, depend on the expected smoothness constant. In particular, if this expected smoothness constant could be calculated a priori, then one could safely set much larger step sizes, which would result in a much faster convergence rate. We fill in this gap and provide simple closed-form expressions for the expected smoothness constant, together with careful numerical experiments verifying these bounds. Using these bounds, and since the SAGA algorithm is part of this JacSketch family, we suggest a new standard practice for setting the step and mini-batch sizes for SAGA that is competitive with a numerical grid search. Furthermore, we can now show that the total complexity of the SAGA algorithm decreases linearly in the mini-batch size up to a pre-defined value: the optimal mini-batch size. This is a rare result in the stochastic variance-reduced literature, only previously shown for the Katyusha algorithm. Finally, we conjecture that this is the case for many other stochastic variance-reduced methods and that our bounds and analysis of the expected smoothness constant are key to extending these results.
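To make the abstract's claim concrete, here is a minimal Python sketch of how a closed-form expected smoothness bound could translate into a step size for mini-batch SAGA. It assumes b-nice sampling (uniform mini-batches of size b drawn without replacement), an interpolating bound that equals L_max = max_i L_i at b = 1 and the full-gradient smoothness constant L at b = n, and the conservative step size gamma(b) = 1/(4 L(b)); the variable names, the exact constants in the bound, and the 1/4 factor are illustrative assumptions, not the paper's precise prescription.

```python
import numpy as np

def expected_smoothness_simple(b, n, L, L_max):
    """Illustrative closed-form bound on the expected smoothness constant
    L(b) for b-nice sampling.

    Assumption: the bound interpolates between L_max at b = 1 and L at
    b = n, matching the qualitative behaviour described in the abstract;
    see the paper for the exact expression and its derivation.
    """
    return (n * (b - 1)) / (b * (n - 1)) * L + (n - b) / (b * (n - 1)) * L_max

def saga_step_size(b, n, L, L_max):
    """Conservative step size gamma(b) = 1 / (4 * L(b)).

    The 1/4 factor follows the classical SAGA analysis; the paper derives
    sharper choices, so treat this as an illustrative lower bound.
    """
    return 1.0 / (4.0 * expected_smoothness_simple(b, n, L, L_max))

# Hypothetical example: for an L2-regularized linear model one might
# estimate L from the full data matrix and L_max from the worst row.
n, L, L_max = 10_000, 1.0, 50.0
for b in (1, 16, 256, n):
    print(f"b = {b:>6d}  L(b) = {expected_smoothness_simple(b, n, L, L_max):8.3f}"
          f"  gamma(b) = {saga_step_size(b, n, L, L_max):.5f}")
```

Since L(b) decreases as b grows, the admissible step size gamma(b) grows with the mini-batch size, which is the mechanism behind the linear decrease in total complexity up to the optimal mini-batch size discussed above.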