Communication-Efficient Federated Learning With Data and Client Heterogeneity

Hossein Zakerinia, Shayan Talaei, Giorgi Nadiradze, Dan Alistarh
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3448-3456, 2024.

Abstract

Federated Learning (FL) enables large-scale distributed training of machine learning models, while still allowing individual nodes to maintain data locally. However, executing FL at scale comes with inherent practical challenges: 1) heterogeneity of the local node data distributions, 2) heterogeneity of node computational speeds (asynchrony), but also 3) constraints in the amount of communication between the clients and the server. In this work, we present the first variant of the classic federated averaging (FedAvg) algorithm which, at the same time, supports data heterogeneity, partial client asynchrony, and communication compression. Our algorithm comes with a novel, rigorous analysis showing that, in spite of these system relaxations, it can provide similar convergence to FedAvg in interesting parameter regimes. Experimental results in the rigorous LEAF benchmark on setups of up to $300$ nodes show that our algorithm ensures fast convergence for standard federated tasks, improving upon prior quantized and asynchronous approaches.
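For readers unfamiliar with the baseline being extended, the following is a minimal, generic sketch of federated averaging (FedAvg) with unbiased stochastic quantization of the client updates, written in NumPy. It is only an illustration of the general idea of combining local SGD with communication compression; it is not the paper's algorithm, and the helper names (`quantize`, `local_sgd`), the toy quadratic objective, and all hyperparameters are assumptions made for the example.

```python
# Generic sketch (NOT the paper's algorithm): FedAvg where each client runs a
# few local SGD steps and sends a stochastically quantized update to the server.
import numpy as np

rng = np.random.default_rng(0)

def quantize(v, levels=16):
    """Unbiased stochastic quantization of vector v to a fixed number of levels."""
    norm = np.linalg.norm(v)
    if norm == 0:
        return v
    scaled = np.abs(v) / norm * levels           # each entry now lies in [0, levels]
    lower = np.floor(scaled)
    prob = scaled - lower                        # probability of rounding up
    rounded = lower + (rng.random(v.shape) < prob)
    return np.sign(v) * rounded * norm / levels  # unbiased: E[output] = v

def local_sgd(x, data, lr=0.1, steps=5):
    """A few local SGD steps on a toy quadratic loss ||x - data_i||^2 / 2."""
    for _ in range(steps):
        i = rng.integers(len(data))
        x = x - lr * (x - data[i])               # gradient of the quadratic loss
    return x

# Heterogeneous client data: each client's local optimum sits at a different point.
num_clients, dim, rounds = 10, 5, 50
client_data = [rng.normal(loc=c, size=(20, dim)) for c in range(num_clients)]
x_global = np.zeros(dim)

for _ in range(rounds):
    updates = []
    for data in client_data:
        x_local = local_sgd(x_global.copy(), data)
        updates.append(quantize(x_local - x_global))  # compress before "sending"
    x_global = x_global + np.mean(updates, axis=0)    # server averages compressed updates

print("final global model:", np.round(x_global, 2))
```

The paper's contribution lies in additionally handling partial client asynchrony and proving convergence under data heterogeneity and compression; the sketch above only conveys the synchronous, compressed FedAvg baseline.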

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-zakerinia24a,
  title     = {Communication-Efficient Federated Learning With Data and Client Heterogeneity},
  author    = {Zakerinia, Hossein and Talaei, Shayan and Nadiradze, Giorgi and Alistarh, Dan},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {3448--3456},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/zakerinia24a/zakerinia24a.pdf},
  url       = {https://proceedings.mlr.press/v238/zakerinia24a.html},
  abstract  = {Federated Learning (FL) enables large-scale distributed training of machine learning models, while still allowing individual nodes to maintain data locally. However, executing FL at scale comes with inherent practical challenges: 1) heterogeneity of the local node data distributions, 2) heterogeneity of node computational speeds (asynchrony), but also 3) constraints in the amount of communication between the clients and the server. In this work, we present the first variant of the classic federated averaging (FedAvg) algorithm which, at the same time, supports data heterogeneity, partial client asynchrony, and communication compression. Our algorithm comes with a novel, rigorous analysis showing that, in spite of these system relaxations, it can provide similar convergence to FedAvg in interesting parameter regimes. Experimental results in the rigorous LEAF benchmark on setups of up to $300$ nodes show that our algorithm ensures fast convergence for standard federated tasks, improving upon prior quantized and asynchronous approaches.}
}
Endnote
%0 Conference Paper
%T Communication-Efficient Federated Learning With Data and Client Heterogeneity
%A Hossein Zakerinia
%A Shayan Talaei
%A Giorgi Nadiradze
%A Dan Alistarh
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-zakerinia24a
%I PMLR
%P 3448--3456
%U https://proceedings.mlr.press/v238/zakerinia24a.html
%V 238
%X Federated Learning (FL) enables large-scale distributed training of machine learning models, while still allowing individual nodes to maintain data locally. However, executing FL at scale comes with inherent practical challenges: 1) heterogeneity of the local node data distributions, 2) heterogeneity of node computational speeds (asynchrony), but also 3) constraints in the amount of communication between the clients and the server. In this work, we present the first variant of the classic federated averaging (FedAvg) algorithm which, at the same time, supports data heterogeneity, partial client asynchrony, and communication compression. Our algorithm comes with a novel, rigorous analysis showing that, in spite of these system relaxations, it can provide similar convergence to FedAvg in interesting parameter regimes. Experimental results in the rigorous LEAF benchmark on setups of up to $300$ nodes show that our algorithm ensures fast convergence for standard federated tasks, improving upon prior quantized and asynchronous approaches.
APA
Zakerinia, H., Talaei, S., Nadiradze, G., & Alistarh, D. (2024). Communication-Efficient Federated Learning With Data and Client Heterogeneity. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3448-3456. Available from https://proceedings.mlr.press/v238/zakerinia24a.html.