On the Privacy-Robustness-Utility Trilemma in Distributed Learning

Youssef Allouah, Rachid Guerraoui, Nirupam Gupta, Rafael Pinot, John Stephan
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:569-626, 2023.

Abstract

The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied independently in distributed ML, their synthesis remains poorly understood. We present the first tight analysis of the error incurred by any algorithm ensuring robustness against a fraction of adversarial machines, as well as differential privacy (DP) for honest machines’ data against any other curious entity. Our analysis exhibits a fundamental trade-off between privacy, robustness, and utility. To prove our lower bound, we consider the case of mean estimation, subject to distributed DP and robustness constraints, and devise reductions to centralized estimation of one-way marginals. We prove our matching upper bound by presenting a new distributed ML algorithm using a high-dimensional robust aggregation rule. The latter amortizes the dependence on the dimension in the error (caused by adversarial workers and DP), while being agnostic to the statistical properties of the data.
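The paper's actual aggregation rule is high-dimensional and amortizes the dimension dependence in the error; the details are in the full text, not this page. Purely as an illustration of the general recipe the abstract describes (robustly aggregate the honest workers' updates, then add calibrated noise for differential privacy), here is a minimal sketch using a coordinate-wise trimmed mean and Gaussian noise. All names are hypothetical, and the noise calibration is left as a parameter; this is not the authors' algorithm.

```python
import random

def trimmed_mean(values, f):
    """Drop the f smallest and f largest entries, then average the rest.
    Tolerates up to f adversarial values among the inputs."""
    s = sorted(values)
    core = s[f:len(s) - f]
    return sum(core) / len(core)

def private_robust_aggregate(vectors, f, sigma, rng=random):
    """Illustrative aggregator (not the paper's rule): apply a trimmed mean
    per coordinate, then add Gaussian noise with standard deviation sigma.
    In a DP deployment, sigma would be calibrated to the privacy budget
    (epsilon, delta) and the sensitivity of a worker's contribution."""
    dim = len(vectors[0])
    return [
        trimmed_mean([v[i] for v in vectors], f) + rng.gauss(0.0, sigma)
        for i in range(dim)
    ]

# Example: 5 workers, one of which is adversarial; sigma=0 isolates the
# robustness step (a real run would use sigma > 0 for privacy).
honest = [[1.0, 2.0]] * 4
adversarial = [[100.0, -100.0]]
agg = private_robust_aggregate(honest + adversarial, f=1, sigma=0.0)
# The outlier is trimmed away, recovering the honest mean [1.0, 2.0].
```

The trade-off the abstract calls a trilemma shows up directly in such a scheme: trimming more aggressively (larger f) and adding more noise (larger sigma) both push the output away from the true mean, so robustness and privacy each cost utility.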

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-allouah23a,
  title     = {On the Privacy-Robustness-Utility Trilemma in Distributed Learning},
  author    = {Allouah, Youssef and Guerraoui, Rachid and Gupta, Nirupam and Pinot, Rafael and Stephan, John},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {569--626},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/allouah23a/allouah23a.pdf},
  url       = {https://proceedings.mlr.press/v202/allouah23a.html},
  abstract  = {The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied independently in distributed ML, their synthesis remains poorly understood. We present the first tight analysis of the error incurred by any algorithm ensuring robustness against a fraction of adversarial machines, as well as differential privacy (DP) for honest machines’ data against any other curious entity. Our analysis exhibits a fundamental trade-off between privacy, robustness, and utility. To prove our lower bound, we consider the case of mean estimation, subject to distributed DP and robustness constraints, and devise reductions to centralized estimation of one-way marginals. We prove our matching upper bound by presenting a new distributed ML algorithm using a high-dimensional robust aggregation rule. The latter amortizes the dependence on the dimension in the error (caused by adversarial workers and DP), while being agnostic to the statistical properties of the data.}
}
Endnote
%0 Conference Paper
%T On the Privacy-Robustness-Utility Trilemma in Distributed Learning
%A Youssef Allouah
%A Rachid Guerraoui
%A Nirupam Gupta
%A Rafael Pinot
%A John Stephan
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-allouah23a
%I PMLR
%P 569--626
%U https://proceedings.mlr.press/v202/allouah23a.html
%V 202
%X The ubiquity of distributed machine learning (ML) in sensitive public domain applications calls for algorithms that protect data privacy, while being robust to faults and adversarial behaviors. Although privacy and robustness have been extensively studied independently in distributed ML, their synthesis remains poorly understood. We present the first tight analysis of the error incurred by any algorithm ensuring robustness against a fraction of adversarial machines, as well as differential privacy (DP) for honest machines’ data against any other curious entity. Our analysis exhibits a fundamental trade-off between privacy, robustness, and utility. To prove our lower bound, we consider the case of mean estimation, subject to distributed DP and robustness constraints, and devise reductions to centralized estimation of one-way marginals. We prove our matching upper bound by presenting a new distributed ML algorithm using a high-dimensional robust aggregation rule. The latter amortizes the dependence on the dimension in the error (caused by adversarial workers and DP), while being agnostic to the statistical properties of the data.
APA
Allouah, Y., Guerraoui, R., Gupta, N., Pinot, R. &amp; Stephan, J. (2023). On the Privacy-Robustness-Utility Trilemma in Distributed Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:569-626. Available from https://proceedings.mlr.press/v202/allouah23a.html.