Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates

Dong Yin, Yudong Chen, Ramchandran Kannan, Peter Bartlett
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5650-5659, 2018.

Abstract

In this paper, we develop distributed optimization algorithms that are provably robust against Byzantine failures, that is, arbitrary and potentially adversarial behavior in distributed computing systems, with a focus on achieving optimal statistical performance. A main result of this work is a sharp analysis of two robust distributed gradient descent algorithms based on median and trimmed mean operations, respectively. We prove statistical error rates for strongly convex, non-strongly convex, and smooth non-convex population loss functions. In particular, these algorithms are shown to achieve order-optimal statistical error rates for strongly convex losses. To achieve better communication efficiency, we further propose a median-based distributed algorithm that is provably robust and uses only one communication round. For strongly convex quadratic loss, we show that this algorithm achieves the same optimal error rate as the robust distributed gradient descent algorithms.
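As a rough illustration of the aggregation rules named above, the sketch below (not from the paper; the function names, trimming fraction beta, and toy data are illustrative assumptions) shows coordinate-wise median and coordinate-wise trimmed-mean aggregation of worker gradients in Python with NumPy. Each rule acts on every coordinate of the stacked gradient vectors independently, which is why a bounded fraction of Byzantine workers cannot pull the aggregate arbitrarily far from the honest gradients.

    import numpy as np

    def coordinate_wise_median(gradients):
        # Stack the m worker gradients into an (m, d) array and take the
        # median of each coordinate across workers.
        return np.median(np.stack(gradients), axis=0)

    def coordinate_wise_trimmed_mean(gradients, beta):
        # For each coordinate, discard the largest and smallest beta-fraction
        # of the m reported values and average what remains.
        stacked = np.sort(np.stack(gradients), axis=0)
        m = stacked.shape[0]
        k = int(beta * m)  # values trimmed from each end, per coordinate
        return stacked[k:m - k].mean(axis=0)

    # Toy check: 8 honest workers and 2 Byzantine workers on a 3-dimensional model.
    rng = np.random.default_rng(0)
    honest = [rng.normal(loc=1.0, scale=0.1, size=3) for _ in range(8)]
    byzantine = [np.full(3, 1e6), np.full(3, -1e6)]  # arbitrary adversarial gradients
    grads = honest + byzantine

    print(coordinate_wise_median(grads))             # stays near the honest value 1.0
    print(coordinate_wise_trimmed_mean(grads, 0.2))  # likewise, after trimming 2 per end

In the robust distributed gradient descent algorithms analyzed in the paper, an aggregate of this kind stands in for the plain average of worker gradients at each iteration; in the one-round variant, the median-based aggregation is instead applied once, to the workers' locally computed solutions.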

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-yin18a,
  title     = {{B}yzantine-Robust Distributed Learning: Towards Optimal Statistical Rates},
  author    = {Yin, Dong and Chen, Yudong and Kannan, Ramchandran and Bartlett, Peter},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {5650--5659},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/yin18a/yin18a.pdf},
  url       = {https://proceedings.mlr.press/v80/yin18a.html}
}
Endnote
%0 Conference Paper
%T Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates
%A Dong Yin
%A Yudong Chen
%A Ramchandran Kannan
%A Peter Bartlett
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-yin18a
%I PMLR
%P 5650--5659
%U https://proceedings.mlr.press/v80/yin18a.html
%V 80
APA
Yin, D., Chen, Y., Kannan, R. & Bartlett, P. (2018). Byzantine-Robust Distributed Learning: Towards Optimal Statistical Rates. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5650-5659. Available from https://proceedings.mlr.press/v80/yin18a.html.