Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data

Yasaman Esfandiari; Sin Yong Tan; Zhanhong Jiang; Aditya Balu; Ethan Herron; Chinmay Hegde; Soumik Sarkar

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3036-3046, 2021.

Abstract

Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server. Recently, decentralized learning algorithms have demonstrated state-of-the-art results on benchmark data sets, comparable with centralized algorithms. However, the key assumption to achieve competitive performance is that the data is independently and identically distributed (IID) among the agents which, in real-life applications, is often not applicable. Inspired by ideas from continual learning, we propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm where (i) each agent aggregates cross-gradient information, i.e., derivatives of its model with respect to its neighbors’ datasets, and (ii) updates its model using a projected gradient based on quadratic programming (QP). We theoretically analyze the convergence characteristics of CGA and demonstrate its efficiency on non-IID data distributions sampled from the MNIST and CIFAR-10 datasets. Our empirical comparisons show superior learning performance of CGA over existing state-of-the-art decentralized learning algorithms, as well as maintaining the improved performance under information compression to reduce peer-to-peer communication overhead. The code is available on GitHub.
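The projected-gradient step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a constraint in the style of Gradient Episodic Memory from continual learning: find the vector closest to the local gradient whose inner product with every cross-gradient is non-negative. The function name `project_gradient`, the `steps` parameter, and the choice to solve the dual QP by projected gradient descent are all hypothetical and chosen only for illustration; the paper's exact QP and solver may differ.

```python
import numpy as np

def project_gradient(g, G, steps=2000):
    """Return the vector z closest to g (Euclidean norm) with G @ z >= 0.

    g: (d,) local gradient. G: (m, d) cross-gradients, one per row.
    Assumed formulation (not taken from the paper's code):
        primal: min_z 0.5 * ||z - g||^2   s.t.  G z >= 0
        dual:   min_{v >= 0} 0.5 * v^T (G G^T) v + (G g)^T v
    The dual is solved by projected gradient descent, then z = g + G^T v.
    """
    g = np.asarray(g, dtype=float)
    G = np.asarray(G, dtype=float)
    if np.all(G @ g >= 0):
        # No conflict with any cross-gradient: keep the gradient as is.
        return g.copy()
    P = G @ G.T                        # (m, m) Gram matrix of cross-gradients
    q = G @ g                          # (m,) linear term of the dual
    lip = np.linalg.eigvalsh(P).max()  # Lipschitz constant of the dual gradient
    v = np.zeros(G.shape[0])
    for _ in range(steps):
        # Gradient step on the dual, then projection onto v >= 0.
        v = np.maximum(0.0, v - (P @ v + q) / lip)
    return g + G.T @ v

# One conflicting cross-gradient: the result lies on the boundary G @ z = 0.
z = project_gradient([1.0, 0.0], [[-1.0, 1.0]])
print(z)  # approximately [0.5, 0.5]
```

The dual has one variable per neighbor rather than one per model parameter, so under these assumptions the QP stays small even when the gradient dimension is large.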

Cite this Paper

BibTeX


@InProceedings{pmlr-v139-esfandiari21a,
  title     = {Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data},
  author    = {Esfandiari, Yasaman and Tan, Sin Yong and Jiang, Zhanhong and Balu, Aditya and Herron, Ethan and Hegde, Chinmay and Sarkar, Soumik},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {3036--3046},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/esfandiari21a/esfandiari21a.pdf},
  url       = {https://proceedings.mlr.press/v139/esfandiari21a.html},
  abstract = 	 {Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server. Recently, decentralized learning algorithms have demonstrated state-of-the-art results on benchmark data sets, comparable with centralized algorithms. However, the key assumption to achieve competitive performance is that the data is independently and identically distributed (IID) among the agents which, in real-life applications, is often not applicable. Inspired by ideas from continual learning, we propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm where (i) each agent aggregates cross-gradient information, i.e., derivatives of its model with respect to its neighbors’ datasets, and (ii) updates its model using a projected gradient based on quadratic programming (QP). We theoretically analyze the convergence characteristics of CGA and demonstrate its efficiency on non-IID data distributions sampled from the MNIST and CIFAR-10 datasets. Our empirical comparisons show superior learning performance of CGA over existing state-of-the-art decentralized learning algorithms, as well as maintaining the improved performance under information compression to reduce peer-to-peer communication overhead. The code is available here on GitHub.}
}

Endnote

%0 Conference Paper
%T Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data
%A Yasaman Esfandiari
%A Sin Yong Tan
%A Zhanhong Jiang
%A Aditya Balu
%A Ethan Herron
%A Chinmay Hegde
%A Soumik Sarkar
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-esfandiari21a
%I PMLR
%P 3036--3046
%U https://proceedings.mlr.press/v139/esfandiari21a.html
%V 139
%X Decentralized learning enables a group of collaborative agents to learn models using a distributed dataset without the need for a central parameter server. Recently, decentralized learning algorithms have demonstrated state-of-the-art results on benchmark data sets, comparable with centralized algorithms. However, the key assumption to achieve competitive performance is that the data is independently and identically distributed (IID) among the agents which, in real-life applications, is often not applicable. Inspired by ideas from continual learning, we propose Cross-Gradient Aggregation (CGA), a novel decentralized learning algorithm where (i) each agent aggregates cross-gradient information, i.e., derivatives of its model with respect to its neighbors’ datasets, and (ii) updates its model using a projected gradient based on quadratic programming (QP). We theoretically analyze the convergence characteristics of CGA and demonstrate its efficiency on non-IID data distributions sampled from the MNIST and CIFAR-10 datasets. Our empirical comparisons show superior learning performance of CGA over existing state-of-the-art decentralized learning algorithms, as well as maintaining the improved performance under information compression to reduce peer-to-peer communication overhead. The code is available here on GitHub.

APA


Esfandiari, Y., Tan, S. Y., Jiang, Z., Balu, A., Herron, E., Hegde, C. & Sarkar, S. (2021). Cross-Gradient Aggregation for Decentralized Learning from Non-IID Data. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:3036-3046. Available from https://proceedings.mlr.press/v139/esfandiari21a.html.
