Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data

Tao Lin, Sai Praneeth Karimireddy, Sebastian Stich, Martin Jaggi
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:6654-6665, 2021.

Abstract

Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, heterogeneity across different clients’ local datasets poses an optimization challenge and may severely deteriorate generalization performance. In this paper, we investigate and identify the limitations of several decentralized optimization algorithms under different degrees of data heterogeneity. We propose a novel momentum-based method to mitigate this decentralized training difficulty. In extensive empirical experiments on CV/NLP datasets (CIFAR-10, ImageNet, and AG News) and several network topologies (Ring and Social Network), we show that our method is substantially more robust to heterogeneity in clients’ data than existing methods, improving test performance by 1%-20%.
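The paper's pseudocode is not reproduced on this page, so as a complement to the abstract, below is a minimal, self-contained numpy sketch of decentralized SGD with a quasi-global-momentum-style buffer on a ring topology. The specific buffer update shown (an exponential moving average of the normalized difference of consecutive local iterates, used in place of raw local gradients) and all hyperparameter values are illustrative assumptions, not the paper's algorithm verbatim.

import numpy as np

# Problem setup: n nodes, each with its own quadratic objective
# f_i(x) = 0.5 * ||x - b_i||^2, so local gradients disagree across nodes
# (a toy stand-in for heterogeneous client data).
n, d = 8, 10
lr, beta = 0.1, 0.9   # step size and momentum factor (hypothetical values)
rng = np.random.default_rng(0)
b = rng.normal(size=(n, d))

# Ring topology: each node gossip-averages with itself and its two neighbors.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 1 / 3
    W[i, (i - 1) % n] = 1 / 3
    W[i, (i + 1) % n] = 1 / 3

x = np.zeros((n, d))       # one model replica per node
m = np.zeros((n, d))       # quasi-global momentum buffers
x_prev = x.copy()

for t in range(200):
    g = x - b              # heterogeneous local gradients
    if t > 0:
        # Assumed update: the buffer tracks the normalized per-round iterate
        # difference, which already reflects the gossip step, instead of the
        # node's own (biased) local gradient.
        m = beta * m + (1 - beta) * (x_prev - x) / lr
    x_prev = x.copy()
    x = W @ (x - lr * (g + beta * m))   # local momentum step, then gossip

print("consensus error:", np.linalg.norm(x - x.mean(axis=0)))
print("distance to optimum:", np.linalg.norm(x.mean(axis=0) - b.mean(axis=0)))

The intuition this sketch tries to capture: because the per-round iterate difference already includes the gossip-averaging step, the momentum buffer tracks a quasi-global descent direction rather than each node's heterogeneous local gradient.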

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-lin21c,
  title     = {Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data},
  author    = {Lin, Tao and Karimireddy, Sai Praneeth and Stich, Sebastian and Jaggi, Martin},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {6654--6665},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/lin21c/lin21c.pdf},
  url       = {https://proceedings.mlr.press/v139/lin21c.html},
  abstract  = {Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, the presence of heterogeneity across different clients’ local datasets poses an optimization challenge and may severely deteriorate the generalization performance. In this paper, we investigate and identify the limitation of several decentralized optimization algorithms for different degrees of data heterogeneity. We propose a novel momentum-based method to mitigate this decentralized training difficulty. We show in extensive empirical experiments on various CV/NLP datasets (CIFAR-10, ImageNet, and AG News) and several network topologies (Ring and Social Network) that our method is much more robust to the heterogeneity of clients’ data than other existing methods, by a significant improvement in test performance (1%-20%).}
}
Endnote
%0 Conference Paper
%T Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data
%A Tao Lin
%A Sai Praneeth Karimireddy
%A Sebastian Stich
%A Martin Jaggi
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-lin21c
%I PMLR
%P 6654--6665
%U https://proceedings.mlr.press/v139/lin21c.html
%V 139
%X Decentralized training of deep learning models is a key element for enabling data privacy and on-device learning over networks. In realistic learning scenarios, the presence of heterogeneity across different clients’ local datasets poses an optimization challenge and may severely deteriorate the generalization performance. In this paper, we investigate and identify the limitation of several decentralized optimization algorithms for different degrees of data heterogeneity. We propose a novel momentum-based method to mitigate this decentralized training difficulty. We show in extensive empirical experiments on various CV/NLP datasets (CIFAR-10, ImageNet, and AG News) and several network topologies (Ring and Social Network) that our method is much more robust to the heterogeneity of clients’ data than other existing methods, by a significant improvement in test performance (1%-20%).
APA
Lin, T., Karimireddy, S.P., Stich, S., & Jaggi, M. (2021). Quasi-global Momentum: Accelerating Decentralized Deep Learning on Heterogeneous Data. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:6654-6665. Available from https://proceedings.mlr.press/v139/lin21c.html.
