Queuing dynamics of asynchronous Federated Learning

Louis Leconte, Matthieu Jonckheere, Sergey Samsonov, Eric Moulines
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1711-1719, 2024.

Abstract

We study asynchronous federated learning mechanisms with nodes having potentially different computational speeds. In such an environment, each node is allowed to work on models with potential delays and contribute to updates to the central server at its own pace. Existing analyses of such algorithms typically depend on intractable quantities such as the maximum node delay and do not consider the underlying queuing dynamics of the system. In this paper, we propose a non-uniform sampling scheme for the central server that allows for lower delays with better complexity, taking into account the closed Jackson network structure of the associated computational graph. Our experiments clearly show a significant improvement of our method over current state-of-the-art asynchronous algorithms on image classification problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-leconte24a,
  title = {Queuing dynamics of asynchronous Federated Learning},
  author = {Leconte, Louis and Jonckheere, Matthieu and Samsonov, Sergey and Moulines, Eric},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {1711--1719},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/leconte24a/leconte24a.pdf},
  url = {https://proceedings.mlr.press/v238/leconte24a.html},
  abstract = {We study asynchronous federated learning mechanisms with nodes having potentially different computational speeds. In such an environment, each node is allowed to work on models with potential delays and contribute to updates to the central server at its own pace. Existing analyses of such algorithms typically depend on intractable quantities such as the maximum node delay and do not consider the underlying queuing dynamics of the system. In this paper, we propose a non-uniform sampling scheme for the central server that allows for lower delays with better complexity, taking into account the closed Jackson network structure of the associated computational graph. Our experiments clearly show a significant improvement of our method over current state-of-the-art asynchronous algorithms on image classification problems.}
}
Endnote
%0 Conference Paper
%T Queuing dynamics of asynchronous Federated Learning
%A Louis Leconte
%A Matthieu Jonckheere
%A Sergey Samsonov
%A Eric Moulines
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-leconte24a
%I PMLR
%P 1711--1719
%U https://proceedings.mlr.press/v238/leconte24a.html
%V 238
%X We study asynchronous federated learning mechanisms with nodes having potentially different computational speeds. In such an environment, each node is allowed to work on models with potential delays and contribute to updates to the central server at its own pace. Existing analyses of such algorithms typically depend on intractable quantities such as the maximum node delay and do not consider the underlying queuing dynamics of the system. In this paper, we propose a non-uniform sampling scheme for the central server that allows for lower delays with better complexity, taking into account the closed Jackson network structure of the associated computational graph. Our experiments clearly show a significant improvement of our method over current state-of-the-art asynchronous algorithms on image classification problems.
APA
Leconte, L., Jonckheere, M., Samsonov, S., & Moulines, E. (2024). Queuing dynamics of asynchronous Federated Learning. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1711-1719. Available from https://proceedings.mlr.press/v238/leconte24a.html.