Statistical Estimation and Online Inference via Local SGD
Proceedings of Thirty Fifth Conference on Learning Theory, PMLR 178:1613-1661, 2022.
Abstract
We analyze Local SGD in federated learning, a novel multi-round estimation procedure that uses intermittent communication to improve communication efficiency. Under a $2{+}\delta$ moment condition on stochastic gradients, we first establish a {\it functional central limit theorem} showing that the averaged iterates of Local SGD converge weakly to a rescaled Brownian motion. We then provide two iterative inference methods: the {\it plug-in} and the {\it random scaling}. Random scaling constructs an asymptotically pivotal statistic for inference by using the information along the whole Local SGD path. Both methods are communication-efficient and applicable to online data. Our results show that Local SGD simultaneously achieves both statistical and communication efficiency.
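For concreteness, here is a minimal sketch of the kind of procedure the abstract describes; the notation below is illustrative and not taken from the paper. With $K$ clients, step sizes $\eta_t$, and stochastic gradients $g^{(k)}_t$, each client performs local updates and the iterates are averaged across clients only at communication rounds,
\[
x^{(k)}_{t+1} = x^{(k)}_t - \eta_t\, g^{(k)}_t\bigl(x^{(k)}_t\bigr), \qquad
x^{(k)}_{t+1} \leftarrow \frac{1}{K}\sum_{j=1}^{K} x^{(j)}_{t+1} \quad \text{if } t+1 \text{ is a communication round},
\]
and inference is based on the Polyak--Ruppert averaged iterate $\bar{x}_n = n^{-1}\sum_{t=1}^{n} x_t$, where $x_t$ denotes the across-client average at step $t$. As one standard instance of random scaling (again illustrative, assuming a scalar or vector parameter with true value $x^\star$), the averaged iterate is studentized by a matrix built from the whole path,
\[
V_n = \frac{1}{n^2}\sum_{s=1}^{n} s^2 \bigl(\bar{x}_s - \bar{x}_n\bigr)\bigl(\bar{x}_s - \bar{x}_n\bigr)^{\!\top},
\qquad
T_n = n\,\bigl(\bar{x}_n - x^\star\bigr)^{\!\top} V_n^{-1} \bigl(\bar{x}_n - x^\star\bigr),
\]
which is asymptotically pivotal under a functional central limit theorem for $\bar{x}_n$, so confidence regions can be formed without estimating the asymptotic covariance.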