Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification

Yuxin Wen, Jonas A. Geiping, Liam Fowl, Micah Goldblum, Tom Goldstein
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:23668-23684, 2022.

Abstract

Federated learning (FL) has rapidly risen in popularity due to its promise of privacy and efficiency. Previous works have exposed privacy vulnerabilities in the FL pipeline by recovering user data from gradient updates. However, existing attacks fail to address realistic settings because they either 1) require toy settings with very small batch sizes, or 2) require unrealistic and conspicuous architecture modifications. We introduce a new strategy that dramatically elevates existing attacks to operate on batches of arbitrarily large size, and without architectural modifications. Our model-agnostic strategy only requires modifications to the model parameters sent to the user, which is a realistic threat model in many scenarios. We demonstrate the strategy in challenging large-scale settings, obtaining high-fidelity data extraction in both cross-device and cross-silo federated learning. Code is available at https://github.com/JonasGeiping/breaching.
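The abstract's core idea can be illustrated with a toy sketch: if the parameters served to a user make one example's loss dominate the batch gradient, that example can be read off the averaged update via the standard linear-layer identity (for a single input x, the gradient of a linear layer satisfies ∂L/∂W = (∂L/∂b) xᵀ, so each row of ∂L/∂W divided by the matching entry of ∂L/∂b recovers x). This is not the authors' construction — their attack achieves magnification purely through the model parameters sent to the user — here the magnification is simulated directly by down-weighting every non-target example's gradient contribution.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup: a batch passes through one linear layer z = Wx + b with a
# simple loss L = sum(z), so dL/dz = 1, dL/dW = outer(1, x), dL/db = 1.
d_in, d_out, batch = 8, 4, 32
X = rng.normal(size=(batch, d_in))

def per_example_grads(x):
    dz = np.ones(d_out)                 # dL/dz for L = sum(z)
    return np.outer(dz, x), dz          # (dL/dW, dL/db)

# Honest averaged batch gradient: individual examples blur into the mean.
gW = np.mean([per_example_grads(x)[0] for x in X], axis=0)
gb = np.mean([per_example_grads(x)[1] for x in X], axis=0)
blurred = gW[0] / gb[0]                 # recovers only the batch mean

# "Magnified" gradient: pretend adversarial parameters drove every
# example's gradient to ~0 except the target's (simulated via weights).
target = X[7]
w = np.full(batch, 1e-6)
w[7] = 1.0
gW_f = np.average([per_example_grads(x)[0] for x in X], axis=0, weights=w)
gb_f = np.average([per_example_grads(x)[1] for x in X], axis=0, weights=w)
fished = gW_f[0] / gb_f[0]              # the target example re-emerges

print(np.allclose(blurred, X.mean(axis=0)))    # True
print(np.allclose(fished, target, atol=1e-3))  # True
```

The down-weighting here stands in for what the paper accomplishes server-side: once a single example dominates the update, any arbitrarily large batch behaves like a batch of one, which is why existing single-image attacks scale up.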

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-wen22a,
  title     = {Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification},
  author    = {Wen, Yuxin and Geiping, Jonas A. and Fowl, Liam and Goldblum, Micah and Goldstein, Tom},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {23668--23684},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/wen22a/wen22a.pdf},
  url       = {https://proceedings.mlr.press/v162/wen22a.html},
  abstract  = {Federated learning (FL) has rapidly risen in popularity due to its promise of privacy and efficiency. Previous works have exposed privacy vulnerabilities in the FL pipeline by recovering user data from gradient updates. However, existing attacks fail to address realistic settings because they either 1) require toy settings with very small batch sizes, or 2) require unrealistic and conspicuous architecture modifications. We introduce a new strategy that dramatically elevates existing attacks to operate on batches of arbitrarily large size, and without architectural modifications. Our model-agnostic strategy only requires modifications to the model parameters sent to the user, which is a realistic threat model in many scenarios. We demonstrate the strategy in challenging large-scale settings, obtaining high-fidelity data extraction in both cross-device and cross-silo federated learning. Code is available at https://github.com/JonasGeiping/breaching.}
}
Endnote
%0 Conference Paper
%T Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification
%A Yuxin Wen
%A Jonas A. Geiping
%A Liam Fowl
%A Micah Goldblum
%A Tom Goldstein
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-wen22a
%I PMLR
%P 23668--23684
%U https://proceedings.mlr.press/v162/wen22a.html
%V 162
%X Federated learning (FL) has rapidly risen in popularity due to its promise of privacy and efficiency. Previous works have exposed privacy vulnerabilities in the FL pipeline by recovering user data from gradient updates. However, existing attacks fail to address realistic settings because they either 1) require toy settings with very small batch sizes, or 2) require unrealistic and conspicuous architecture modifications. We introduce a new strategy that dramatically elevates existing attacks to operate on batches of arbitrarily large size, and without architectural modifications. Our model-agnostic strategy only requires modifications to the model parameters sent to the user, which is a realistic threat model in many scenarios. We demonstrate the strategy in challenging large-scale settings, obtaining high-fidelity data extraction in both cross-device and cross-silo federated learning. Code is available at https://github.com/JonasGeiping/breaching.
APA
Wen, Y., Geiping, J.A., Fowl, L., Goldblum, M. & Goldstein, T. (2022). Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:23668-23684. Available from https://proceedings.mlr.press/v162/wen22a.html.