Communication-Efficient Distributed Optimization in Networks with Gradient Tracking and Variance Reduction
Proceedings of the Twenty-Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1662–1672, 2020.
Abstract
Due to the imminent need to alleviate the communication burden in multi-agent and federated learning, the investigation of communication-efficient distributed optimization algorithms for empirical risk minimization has flourished recently. A large fraction of existing algorithms are developed for the master/slave setting, relying on the presence of a central parameter server. This paper focuses on distributed optimization in the network setting (also known as the decentralized setting), where each agent is only allowed to aggregate information from its neighbors over a graph. By properly adjusting the global gradient estimate via a tracking term, we first develop a communication-efficient approximate Newton-type method, called Network-DANE, which generalizes the attractive DANE algorithm to decentralized networks. Our key algorithmic ideas can be applied, in a systematic manner, to obtain decentralized versions of other master/slave distributed algorithms. Notably, we develop Network-SVRG/SARAH, which employ stochastic variance reduction at each agent to accelerate local computations. We establish linear convergence of Network-DANE and Network-SVRG for strongly convex losses, and of Network-SARAH for quadratic losses; these results shed light on the impact of data homogeneity, network connectivity, and local averaging upon the rate of convergence. Numerical evidence is provided to demonstrate the appealing performance of our algorithms over competitive baselines, in terms of both communication and computation efficiency.
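The "tracking term" mentioned above refers to the standard decentralized gradient-tracking template, in which each agent maintains an auxiliary variable that estimates the network-wide average gradient. A minimal NumPy sketch of that template on a toy problem follows; the ring topology, quadratic losses, Metropolis-style weights, and step size here are illustrative assumptions, not the paper's Network-DANE method itself:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: n agents on a ring, each holding a local
# quadratic loss f_i(x) = 0.5 * ||A_i x - b_i||^2.
n, d = 4, 3
A = [rng.normal(size=(5, d)) for _ in range(n)]
b = [rng.normal(size=5) for _ in range(n)]

def grad(i, x):
    """Gradient of agent i's local loss at x."""
    return A[i].T @ (A[i] @ x - b[i])

# Doubly stochastic mixing matrix for the ring (each agent averages
# with its two neighbors); rows and columns sum to one.
W = np.zeros((n, n))
for i in range(n):
    W[i, i] = 0.5
    W[i, (i - 1) % n] = 0.25
    W[i, (i + 1) % n] = 0.25

# Gradient tracking: X stacks the agents' iterates, and Y stacks the
# trackers, initialized at the local gradients so that the average of
# the Y rows always equals the average of the local gradients.
X = np.zeros((n, d))
Y = np.stack([grad(i, X[i]) for i in range(n)])
alpha = 0.02  # illustrative constant step size
for _ in range(2000):
    X_new = W @ X - alpha * Y                     # consensus + descent step
    Y = W @ Y + np.stack(                         # update the tracker with
        [grad(i, X_new[i]) - grad(i, X[i])        # the change in each local
         for i in range(n)])                      # gradient
    X = X_new

# Compare against the centralized minimizer of sum_i f_i.
x_star, *_ = np.linalg.lstsq(np.vstack(A), np.concatenate(b), rcond=None)
err = max(np.linalg.norm(X[i] - x_star) for i in range(n))
```

With a connected graph and a sufficiently small step size, every agent's iterate converges to the minimizer of the aggregate loss, which is the linear-rate behavior the abstract establishes (under its own conditions) for the Network- family of algorithms.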