On the Tension between Byzantine Robustness and No-Attack Accuracy in Distributed Learning

Yi-Rui Yang, Chang-Wei Shi, Wu-Jun Li
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:71051-71072, 2025.

Abstract

Byzantine-robust distributed learning (BRDL), which refers to distributed learning that can work in the presence of potentially faulty or malicious workers (also known as Byzantine workers), has recently attracted much research attention. Robust aggregators are widely used in existing BRDL methods to obtain robustness against Byzantine workers. However, Byzantine workers do not always exist in applications. To the best of our knowledge, little existing work has theoretically investigated the effect of using robust aggregators when there are no Byzantine workers. To bridge this knowledge gap, we theoretically analyze the aggregation error of robust aggregators when there are no Byzantine workers. Specifically, we show that the worst-case aggregation error without Byzantine workers increases with the number of Byzantine workers that a robust aggregator can tolerate. This theoretical result reveals a tension between Byzantine robustness and no-attack accuracy, where no-attack accuracy refers in this paper to accuracy in the absence of faulty or malicious workers. Furthermore, we provide lower bounds on the convergence rate of gradient descent with robust aggregators for non-convex objective functions and for objective functions that satisfy the Polyak-Łojasiewicz (PL) condition, respectively. We also prove that these lower bounds are tight. The lower bounds on the convergence rate reveal a similar tension between Byzantine robustness and no-attack accuracy. Empirical results further support our theoretical findings.
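The central claim above, that the worst-case aggregation error without Byzantine workers grows with the number of Byzantine workers an aggregator is built to tolerate, can be illustrated with a small simulation. The sketch below is illustrative only and is not the paper's construction or analysis: it assumes a coordinate-wise trimmed mean as a representative robust aggregator (the abstract does not name specific aggregators) and heterogeneous honest gradients, and it measures how far the robust aggregate drifts from the plain mean as the tolerated number of Byzantine workers f increases.

```python
import numpy as np

rng = np.random.default_rng(0)

def trimmed_mean(grads, f):
    """Coordinate-wise trimmed mean: in each coordinate, discard the f largest
    and f smallest values across workers, then average the rest.
    (Illustrative robust aggregator; not necessarily the one studied in the paper.)"""
    sorted_grads = np.sort(grads, axis=0)
    return sorted_grads[f:grads.shape[0] - f].mean(axis=0)

n, d, trials = 20, 10, 2000          # workers, gradient dimension, Monte-Carlo repeats
errors = {f: [] for f in (0, 2, 5, 8)}

for _ in range(trials):
    # All workers are honest; their local gradients are heterogeneous
    # (right-skewed here), as can happen with non-IID local data.
    grads = rng.exponential(scale=1.0, size=(n, d))
    ideal = grads.mean(axis=0)        # plain mean = ideal no-attack aggregate
    for f in errors:                  # f = number of Byzantine workers tolerated
        errors[f].append(np.linalg.norm(trimmed_mean(grads, f) - ideal))

for f, errs in errors.items():
    print(f"f={f}: average aggregation error without attack = {np.mean(errs):.4f}")
```

In this toy setting, a larger f means more honest gradients are discarded, so the aggregate drifts further from the ideal mean even though no worker is Byzantine, which mirrors the tension described in the abstract.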

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-yang25aa,
  title     = {On the Tension between {B}yzantine Robustness and No-Attack Accuracy in Distributed Learning},
  author    = {Yang, Yi-Rui and Shi, Chang-Wei and Li, Wu-Jun},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {71051--71072},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/yang25aa/yang25aa.pdf},
  url       = {https://proceedings.mlr.press/v267/yang25aa.html}
}
EndNote
%0 Conference Paper
%T On the Tension between Byzantine Robustness and No-Attack Accuracy in Distributed Learning
%A Yi-Rui Yang
%A Chang-Wei Shi
%A Wu-Jun Li
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-yang25aa
%I PMLR
%P 71051--71072
%U https://proceedings.mlr.press/v267/yang25aa.html
%V 267
APA
Yang, Y., Shi, C. & Li, W. (2025). On the Tension between Byzantine Robustness and No-Attack Accuracy in Distributed Learning. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:71051-71072. Available from https://proceedings.mlr.press/v267/yang25aa.html.
