Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction

Boyue Li, Shicong Cen, Yuxin Chen, Yuejie Chi
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1662-1672, 2020.

Abstract

Due to the imminent need to alleviate the communication burden in multi-agent and federated learning, the investigation of communication-efficient distributed optimization algorithms for empirical risk minimization has flourished recently. A large fraction of existing algorithms are developed for the master/slave setting, relying on the presence of a central parameter server. This paper focuses on distributed optimization in the network setting (also known as the decentralized setting), where each agent is only allowed to aggregate information from its neighbors over a graph. By properly adjusting the global gradient estimate via a tracking term, we first develop a communication-efficient approximate Newton-type method, called Network-DANE, which generalizes the attractive DANE algorithm to decentralized networks. Our key algorithmic ideas can be applied, in a systematic manner, to obtain decentralized versions of other master/slave distributed algorithms. Notably, we develop Network-SVRG/SARAH, which employ stochastic variance reduction at each agent to accelerate local computations. We establish linear convergence of Network-DANE and Network-SVRG for strongly convex losses, and Network-SARAH for quadratic losses, which shed light on the impact of data homogeneity, network connectivity, and local averaging upon the rate of convergence. Numerical evidence is provided to demonstrate the appealing performance of our algorithms over competitive baselines, in terms of both communication and computation efficiency.

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-li20f, title = {Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction}, author = {Li, Boyue and Cen, Shicong and Chen, Yuxin and Chi, Yuejie}, booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics}, pages = {1662--1672}, year = {2020}, editor = {Silvia Chiappa and Roberto Calandra}, volume = {108}, series = {Proceedings of Machine Learning Research}, month = {26--28 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v108/li20f/li20f.pdf}, url = { http://proceedings.mlr.press/v108/li20f.html }, abstract = {Due to the imminent need to alleviate the communication burden in multi-agent and federated learning, the investigation of communication-efficient distributed optimization algorithms for empirical risk minimization has flourished recently. A large fraction of existing algorithms are developed for the master/slave setting, relying on the presence of a central parameter server. This paper focuses on distributed optimization in the network setting (also known as the decentralized setting), where each agent is only allowed to aggregate information from its neighbors over a graph. By properly adjusting the global gradient estimate via a tracking term, we first develop a communication-efficient approximate Newton-type method, called Network-DANE, which generalizes the attractive DANE algorithm to decentralized networks. Our key algorithmic ideas can be applied, in a systematic manner, to obtain decentralized versions of other master/slave distributed algorithms. Notably, we develop Network-SVRG/SARAH, which employ stochastic variance reduction at each agent to accelerate local computations. We establish linear convergence of Network-DANE and Network-SVRG for strongly convex losses, and Network-SARAH for quadratic losses, which shed light on the impact of data homogeneity, network connectivity, and local averaging upon the rate of convergence. Numerical evidence is provided to demonstrate the appealing performance of our algorithms over competitive baselines, in terms of both communication and computation efficiency.} }
Endnote
%0 Conference Paper %T Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction %A Boyue Li %A Shicong Cen %A Yuxin Chen %A Yuejie Chi %B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2020 %E Silvia Chiappa %E Roberto Calandra %F pmlr-v108-li20f %I PMLR %P 1662--1672 %U http://proceedings.mlr.press/v108/li20f.html %V 108 %X Due to the imminent need to alleviate the communication burden in multi-agent and federated learning, the investigation of communication-efficient distributed optimization algorithms for empirical risk minimization has flourished recently. A large fraction of existing algorithms are developed for the master/slave setting, relying on the presence of a central parameter server. This paper focuses on distributed optimization in the network setting (also known as the decentralized setting), where each agent is only allowed to aggregate information from its neighbors over a graph. By properly adjusting the global gradient estimate via a tracking term, we first develop a communication-efficient approximate Newton-type method, called Network-DANE, which generalizes the attractive DANE algorithm to decentralized networks. Our key algorithmic ideas can be applied, in a systematic manner, to obtain decentralized versions of other master/slave distributed algorithms. Notably, we develop Network-SVRG/SARAH, which employ stochastic variance reduction at each agent to accelerate local computations. We establish linear convergence of Network-DANE and Network-SVRG for strongly convex losses, and Network-SARAH for quadratic losses, which shed light on the impact of data homogeneity, network connectivity, and local averaging upon the rate of convergence. Numerical evidence is provided to demonstrate the appealing performance of our algorithms over competitive baselines, in terms of both communication and computation efficiency.
APA
Li, B., Cen, S., Chen, Y. & Chi, Y.. (2020). Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:1662-1672 Available from http://proceedings.mlr.press/v108/li20f.html .

Related Material