No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation

Feilong Zhang; Xianming Liu; Shiyi Lin; Gang Wu; Xiong Zhou; Junjun Jiang; Xiangyang Ji

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:41399-41413, 2023.

Abstract

Federated learning suffers from a latency bottleneck induced by network stragglers, which hampers the training efficiency significantly. In addition, due to the heterogeneous data distribution and security requirements, simple and fast averaging aggregation is not feasible anymore. Instead, complicated aggregation operations, such as knowledge distillation, are required. The time cost for complicated aggregation becomes a new bottleneck that limits the computational efficiency of FL. In this work, we claim that the root cause of training latency actually lies in the aggregation-then-broadcasting workflow of the server. By swapping the computational order of aggregation and broadcasting, we propose a novel and efficient parallel federated learning (PFL) framework that unlocks the edge nodes during global computation and the central server during local computation. This fully asynchronous and parallel pipeline enables handling complex aggregation and network stragglers, allowing flexible device participation as well as achieving scalability in computation. We theoretically prove that synchronous and asynchronous PFL can achieve a similar convergence rate as vanilla FL. Extensive experiments empirically show that our framework brings up to $5.56\times$ speedup compared with traditional FL. Code is available at: https://github.com/Hypervoyager/PFL.
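To make the idle-time argument in the abstract concrete, the following is a minimal timing sketch, not the authors' implementation. It assumes a toy model in which every round has a fixed server aggregation time and fixed per-client local training times; the function names and numbers are hypothetical. It only captures the overlap of edge and server computation, and ignores communication cost and asynchronous straggler handling, so it is not meant to reproduce the speedup reported in the paper.

```python
from typing import Sequence


def sequential_time(local_times: Sequence[float], agg_time: float, rounds: int) -> float:
    """Toy wall-clock time for an aggregate-then-broadcast workflow.

    Assumption: in each round the server waits for the slowest client,
    then the clients wait while the server aggregates.
    """
    return rounds * (max(local_times) + agg_time)


def parallel_time(local_times: Sequence[float], agg_time: float, rounds: int) -> float:
    """Toy wall-clock time when edge and server computation overlap.

    Assumption: after the first round fills the pipeline, local training
    and server aggregation run concurrently, so each further round costs
    only the slower of the two stages.
    """
    slowest_client = max(local_times)
    per_round = max(slowest_client, agg_time)
    return slowest_client + agg_time + (rounds - 1) * per_round


if __name__ == "__main__":
    # Hypothetical numbers: three clients and a costly aggregation step.
    local = [1.0, 2.0, 3.0]
    agg = 3.0
    seq = sequential_time(local, agg, rounds=100)
    par = parallel_time(local, agg, rounds=100)
    print(f"sequential: {seq:.1f}, parallel: {par:.1f}, ratio: {seq / par:.2f}")
```

In this toy model the benefit of overlapping is largest when aggregation time is comparable to the slowest client's training time, which corresponds to the costly-aggregation setting (for example, knowledge distillation) that the abstract describes.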

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-zhang23aa,
  title = 	 {No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation},
  author =       {Zhang, Feilong and Liu, Xianming and Lin, Shiyi and Wu, Gang and Zhou, Xiong and Jiang, Junjun and Ji, Xiangyang},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {41399--41413},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/zhang23aa/zhang23aa.pdf},
  url = 	 {https://proceedings.mlr.press/v202/zhang23aa.html},
  abstract = 	 {Federated learning suffers from a latency bottleneck induced by network stragglers, which hampers the training efficiency significantly. In addition, due to the heterogeneous data distribution and security requirements, simple and fast averaging aggregation is not feasible anymore. Instead, complicated aggregation operations, such as knowledge distillation, are required. The time cost for complicated aggregation becomes a new bottleneck that limits the computational efficiency of FL. In this work, we claim that the root cause of training latency actually lies in the aggregation-then-broadcasting workflow of the server. By swapping the computational order of aggregation and broadcasting, we propose a novel and efficient parallel federated learning (PFL) framework that unlocks the edge nodes during global computation and the central server during local computation. This fully asynchronous and parallel pipeline enables handling complex aggregation and network stragglers, allowing flexible device participation as well as achieving scalability in computation. We theoretically prove that synchronous and asynchronous PFL can achieve a similar convergence rate as vanilla FL. Extensive experiments empirically show that our framework brings up to $5.56\times$ speedup compared with traditional FL. Code is available at: https://github.com/Hypervoyager/PFL.}
}

Endnote

%0 Conference Paper
%T No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation
%A Feilong Zhang
%A Xianming Liu
%A Shiyi Lin
%A Gang Wu
%A Xiong Zhou
%A Junjun Jiang
%A Xiangyang Ji
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-zhang23aa
%I PMLR
%P 41399--41413
%U https://proceedings.mlr.press/v202/zhang23aa.html
%V 202
%X Federated learning suffers from a latency bottleneck induced by network stragglers, which hampers the training efficiency significantly. In addition, due to the heterogeneous data distribution and security requirements, simple and fast averaging aggregation is not feasible anymore. Instead, complicated aggregation operations, such as knowledge distillation, are required. The time cost for complicated aggregation becomes a new bottleneck that limits the computational efficiency of FL. In this work, we claim that the root cause of training latency actually lies in the aggregation-then-broadcasting workflow of the server. By swapping the computational order of aggregation and broadcasting, we propose a novel and efficient parallel federated learning (PFL) framework that unlocks the edge nodes during global computation and the central server during local computation. This fully asynchronous and parallel pipeline enables handling complex aggregation and network stragglers, allowing flexible device participation as well as achieving scalability in computation. We theoretically prove that synchronous and asynchronous PFL can achieve a similar convergence rate as vanilla FL. Extensive experiments empirically show that our framework brings up to $5.56\times$ speedup compared with traditional FL. Code is available at: https://github.com/Hypervoyager/PFL.

APA


Zhang, F., Liu, X., Lin, S., Wu, G., Zhou, X., Jiang, J. & Ji, X. (2023). No One Idles: Efficient Heterogeneous Federated Learning with Parallel Edge and Server Computation. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:41399-41413. Available from https://proceedings.mlr.press/v202/zhang23aa.html.
