Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle

Shaocong Ma, Yi Zhou
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:6565-6574, 2020.

Abstract

Although SGD with random reshuffle has been widely used in machine learning applications, there is limited understanding of how model characteristics affect the convergence of the algorithm. In this work, we introduce model incoherence to characterize the diversity of model characteristics and study its impact on the convergence of SGD with random reshuffle under weak strong convexity. Specifically, minimizer incoherence measures the discrepancy between the global minimizers of a sample loss and those of the total loss, and it affects the convergence error of SGD with random reshuffle. In particular, we show that the variable sequence generated by SGD with random reshuffle converges to a certain global minimizer of the total loss under full minimizer coherence. The other measure, curvature incoherence, captures the quality of the condition numbers of the sample losses and determines the convergence rate of SGD. Under model incoherence, our results show that SGD achieves a faster convergence rate and a smaller convergence error with random reshuffle than with random sampling, and hence provide justification for the superior practical performance of SGD with random reshuffle.
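The two sampling schemes compared in the abstract can be illustrated with a minimal sketch (not the paper's construction): incremental SGD on a least-squares total loss, once with an epoch-wise random permutation (random reshuffle) and once with i.i.d. sampling with replacement. The function names and the choice of a consistent linear system `b = A @ x_star` are illustrative assumptions; the consistent system makes every sample loss share the common minimizer `x_star`, i.e. the full-minimizer-coherence regime described above.

```python
import numpy as np

# Sketch only: contrasts random reshuffle vs. random sampling for incremental SGD
# on the total loss f(x) = (1/n) * sum_i 0.5 * (a_i . x - b_i)^2.
rng = np.random.default_rng(0)
n, d = 50, 5
A = rng.normal(size=(n, d))
x_star = rng.normal(size=d)
b = A @ x_star  # consistent system: all sample losses share the minimizer x_star

def sgd_reshuffle(A, b, lr=0.05, epochs=200):
    """SGD with random reshuffle: each epoch visits every sample once,
    in a freshly drawn random permutation."""
    x = np.zeros(A.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(b)):
            x -= lr * (A[i] @ x - b[i]) * A[i]  # gradient of the i-th sample loss
    return x

def sgd_sampling(A, b, lr=0.05, iters=200 * 50):
    """SGD with random sampling: each step draws one sample index
    uniformly with replacement (same total number of gradient steps)."""
    x = np.zeros(A.shape[1])
    for _ in range(iters):
        i = rng.integers(len(b))
        x -= lr * (A[i] @ x - b[i]) * A[i]
    return x

err_rr = np.linalg.norm(sgd_reshuffle(A, b) - x_star)
err_rs = np.linalg.norm(sgd_sampling(A, b) - x_star)
```

In this coherent setting both variants drive the error to the shared minimizer; the paper's analysis concerns how minimizer and curvature incoherence separate their convergence errors and rates in the general case.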

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-ma20e,
  title     = {Understanding the Impact of Model Incoherence on Convergence of Incremental {SGD} with Random Reshuffle},
  author    = {Ma, Shaocong and Zhou, Yi},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {6565--6574},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/ma20e/ma20e.pdf},
  url       = {https://proceedings.mlr.press/v119/ma20e.html},
  abstract  = {Although SGD with random reshuffle has been widely-used in machine learning applications, there is a limited understanding of how model characteristics affect the convergence of the algorithm. In this work, we introduce model incoherence to characterize the diversity of model characteristics and study its impact on convergence of SGD with random reshuffle under weak strong convexity. Specifically, minimizer incoherence measures the discrepancy between the global minimizers of a sample loss and those of the total loss and affects the convergence error of SGD with random reshuffle. In particular, we show that the variable sequence generated by SGD with random reshuffle converges to a certain global minimizer of the total loss under full minimizer coherence. The other curvature incoherence measures the quality of condition numbers of the sample losses and determines the convergence rate of SGD. With model incoherence, our results show that SGD has a faster convergence rate and smaller convergence error under random reshuffle than those under random sampling, and hence provide justifications to the superior practical performance of SGD with random reshuffle.}
}
Endnote
%0 Conference Paper
%T Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle
%A Shaocong Ma
%A Yi Zhou
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-ma20e
%I PMLR
%P 6565--6574
%U https://proceedings.mlr.press/v119/ma20e.html
%V 119
%X Although SGD with random reshuffle has been widely-used in machine learning applications, there is a limited understanding of how model characteristics affect the convergence of the algorithm. In this work, we introduce model incoherence to characterize the diversity of model characteristics and study its impact on convergence of SGD with random reshuffle under weak strong convexity. Specifically, minimizer incoherence measures the discrepancy between the global minimizers of a sample loss and those of the total loss and affects the convergence error of SGD with random reshuffle. In particular, we show that the variable sequence generated by SGD with random reshuffle converges to a certain global minimizer of the total loss under full minimizer coherence. The other curvature incoherence measures the quality of condition numbers of the sample losses and determines the convergence rate of SGD. With model incoherence, our results show that SGD has a faster convergence rate and smaller convergence error under random reshuffle than those under random sampling, and hence provide justifications to the superior practical performance of SGD with random reshuffle.
APA
Ma, S. & Zhou, Y. (2020). Understanding the Impact of Model Incoherence on Convergence of Incremental SGD with Random Reshuffle. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:6565-6574. Available from https://proceedings.mlr.press/v119/ma20e.html.

Related Material