Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates

Ahmad Rammal, Kaja Gruntkowska, Nikita Fedin, Eduard Gorbunov, Peter Richtarik
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:1207-1215, 2024.

Abstract

Byzantine robustness is an essential feature of algorithms for certain distributed optimization problems, typically encountered in collaborative/federated learning. These problems are usually huge-scale, implying that communication compression is also imperative for their resolution. These factors have spurred recent algorithmic and theoretical developments in the literature of Byzantine-robust learning with compression. In this paper, we contribute to this research area in two main directions. First, we propose a new Byzantine-robust method with compression – Byz-DASHA-PAGE – and prove that the new method has a better convergence rate (for non-convex and Polyak-Lojasiewicz smooth optimization problems), a smaller neighborhood size in the heterogeneous case, and tolerates more Byzantine workers under over-parametrization than the previous method with SOTA theoretical convergence guarantees (Byz-VR-MARINA). Second, we develop the first Byzantine-robust method with communication compression and error feedback – Byz-EF21 – along with its bi-directional compression version – Byz-EF21-BC – and derive the convergence rates for these methods for the non-convex and Polyak-Lojasiewicz smooth cases. We test the proposed methods and illustrate our theoretical findings in numerical experiments.
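To give a flavor of the ingredients named in the abstract, the sketch below combines EF21-style error feedback with a Top-K compressor and a coordinate-wise-median aggregator on a toy quadratic problem. This is only a rough illustration, not the paper's Byz-EF21 or Byz-DASHA-PAGE: the function and variable names (top_k, coordinate_median, ef21_robust_gd), the choice of aggregator, and the toy objective are all assumptions made for the example.

```python
# Minimal sketch (NOT the paper's algorithm): EF21-style error feedback with
# Top-K compression and a simple robust aggregation rule on toy quadratics.
import numpy as np

def top_k(v, k):
    """Keep the k largest-magnitude entries of v, zero out the rest."""
    out = np.zeros_like(v)
    idx = np.argpartition(np.abs(v), -k)[-k:]
    out[idx] = v[idx]
    return out

def coordinate_median(vectors):
    """Coordinate-wise median: one simple example of a robust aggregator."""
    return np.median(np.stack(vectors), axis=0)

def ef21_robust_gd(grad_fns, x0, lr=0.1, k=1, steps=200):
    """EF21-style loop: each worker keeps a gradient estimate g_i and sends
    only the compressed difference top_k(grad_i(x) - g_i) each round."""
    x = x0.copy()
    g = [fn(x) for fn in grad_fns]             # uncompressed warm start of the estimates
    for _ in range(steps):
        x = x - lr * coordinate_median(g)      # server step with robust aggregation
        for i, fn in enumerate(grad_fns):
            g[i] = g[i] + top_k(fn(x) - g[i], k)  # error-feedback update on each worker
    return x

if __name__ == "__main__":
    # Heterogeneous toy objectives f_i(x) = 0.5 * ||x - b_i||^2, so grad_i(x) = x - b_i.
    rng = np.random.default_rng(0)
    targets = [rng.normal(size=5) for _ in range(10)]
    grads = [lambda x, b=b: x - b for b in targets]
    x_out = ef21_robust_gd(grads, x0=np.zeros(5), lr=0.5, k=2)
    print(x_out)  # approaches the coordinate-wise median of the b_i under this aggregator
```

The point of the error-feedback update is that each worker transmits only a sparse correction to its previous estimate, so heavy compression does not accumulate an uncontrolled bias; the robust aggregator is what a Byzantine-tolerant variant would add on the server side.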

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-rammal24a, title = {Communication Compression for {B}yzantine Robust Learning: New Efficient Algorithms and Improved Rates}, author = {Rammal, Ahmad and Gruntkowska, Kaja and Fedin, Nikita and Gorbunov, Eduard and Richtarik, Peter}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {1207--1215}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/rammal24a/rammal24a.pdf}, url = {https://proceedings.mlr.press/v238/rammal24a.html}, abstract = {Byzantine robustness is an essential feature of algorithms for certain distributed optimization problems, typically encountered in collaborative/federated learning. These problems are usually huge-scale, implying that communication compression is also imperative for their resolution. These factors have spurred recent algorithmic and theoretical developments in the literature of Byzantine-robust learning with compression. In this paper, we contribute to this research area in two main directions. First, we propose a new Byzantine-robust method with compression – Byz-DASHA-PAGE – and prove that the new method has better convergence rate (for non-convex and Polyak-Lojasiewicz smooth optimization problems), smaller neighborhood size in the heterogeneous case, and tolerates more Byzantine workers under over-parametrization than the previous method with SOTA theoretical convergence guarantees (Byz-VR-MARINA). Secondly, we develop the first Byzantine-robust method with communication compression and error feedback – Byz-EF21 – along with its bi-directional compression version – Byz-EF21-BC – and derive the convergence rates for these methods for non-convex and Polyak-Lojasiewicz smooth case. We test the proposed methods and illustrate our theoretical findings in the numerical experiments.} }
Endnote
%0 Conference Paper %T Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates %A Ahmad Rammal %A Kaja Gruntkowska %A Nikita Fedin %A Eduard Gorbunov %A Peter Richtarik %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-rammal24a %I PMLR %P 1207--1215 %U https://proceedings.mlr.press/v238/rammal24a.html %V 238 %X Byzantine robustness is an essential feature of algorithms for certain distributed optimization problems, typically encountered in collaborative/federated learning. These problems are usually huge-scale, implying that communication compression is also imperative for their resolution. These factors have spurred recent algorithmic and theoretical developments in the literature of Byzantine-robust learning with compression. In this paper, we contribute to this research area in two main directions. First, we propose a new Byzantine-robust method with compression – Byz-DASHA-PAGE – and prove that the new method has better convergence rate (for non-convex and Polyak-Lojasiewicz smooth optimization problems), smaller neighborhood size in the heterogeneous case, and tolerates more Byzantine workers under over-parametrization than the previous method with SOTA theoretical convergence guarantees (Byz-VR-MARINA). Secondly, we develop the first Byzantine-robust method with communication compression and error feedback – Byz-EF21 – along with its bi-directional compression version – Byz-EF21-BC – and derive the convergence rates for these methods for non-convex and Polyak-Lojasiewicz smooth case. We test the proposed methods and illustrate our theoretical findings in the numerical experiments.
APA
Rammal, A., Gruntkowska, K., Fedin, N., Gorbunov, E. & Richtarik, P. (2024). Communication Compression for Byzantine Robust Learning: New Efficient Algorithms and Improved Rates. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:1207-1215. Available from https://proceedings.mlr.press/v238/rammal24a.html.