Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning

Christopher A. Choquette-Choo, Hugh Brendan McMahan, J Keith Rush, Abhradeep Guha Thakurta
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:5924-5963, 2023.

Abstract

We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) with multiple passes (epochs) over a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. We formalize the problem of DP mechanisms for adaptive streams with multiple participations and introduce a non-trivial extension of online matrix factorization DP mechanisms to our setting. This includes establishing the necessary theory for sensitivity calculations and efficient computation of optimal matrices. For some applications like $>\!\! 10,000$ SGD steps, applying these optimal techniques becomes computationally expensive. We thus design an efficient Fourier-transform-based mechanism with only a minor utility loss. Extensive empirical evaluation on both example-level DP for image classification and user-level DP for language modeling demonstrates substantial improvements over all previous methods, including the widely-used DP-SGD. Though our primary application is to ML, our main DP results are applicable to arbitrary linear queries and hence may have much broader applicability.
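The matrix factorization framework the abstract builds on can be illustrated with a minimal, single-participation sketch: factor the prefix-sum workload $A = BC$, add Gaussian noise calibrated to the sensitivity of $C$, and post-process with $B$. The toy factorizations and error formula below are standard background, not the paper's optimized multi-epoch mechanism; all names here (`mf_mechanism`, `expected_sq_error`) are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Workload: n-step prefix sums A @ g, as in SGD-style training where
# step t releases the sum of the first t (clipped) gradients.
n = 8
A = np.tril(np.ones((n, n)))

def mf_mechanism(B, C, g, sigma):
    """Matrix-factorization mechanism: release B @ (C @ g + z).
    Under single participation, one data point affects one row of g,
    so the L2 sensitivity of C @ g is the largest column norm of C."""
    sens = np.linalg.norm(C, axis=0).max()
    z = rng.normal(0.0, sigma * sens, size=C.shape[0])
    return B @ (C @ g + z)

def expected_sq_error(B, C, sigma):
    """The noise reaching the output is B @ z, so the total expected
    squared error is sigma^2 * sens(C)^2 * ||B||_F^2."""
    sens = np.linalg.norm(C, axis=0).max()
    return sigma**2 * sens**2 * np.linalg.norm(B, "fro")**2

# Two trivial factorizations of A = B @ C:
#   C = A, B = I: noise added directly to each prefix sum.
#   C = I, B = A: per-step noise, then summed (DP-SGD-style).
I = np.eye(n)
for B, C in [(I, A), (A, I)]:
    assert np.allclose(B @ C, A)  # valid factorization of the workload
    print(expected_sq_error(B, C, sigma=1.0))
```

The paper's mechanisms instead optimize $B$ and $C$ directly, under a sensitivity constraint that accounts for a data point participating in multiple epochs, which this sketch does not model.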

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-choquette-choo23a,
  title     = {Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning},
  author    = {Choquette-Choo, Christopher A. and McMahan, Hugh Brendan and Rush, J Keith and Guha Thakurta, Abhradeep},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {5924--5963},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/choquette-choo23a/choquette-choo23a.pdf},
  url       = {https://proceedings.mlr.press/v202/choquette-choo23a.html},
  abstract  = {We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) with multiple passes (epochs) over a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. We formalize the problem of DP mechanisms for adaptive streams with multiple participations and introduce a non-trivial extension of online matrix factorization DP mechanisms to our setting. This includes establishing the necessary theory for sensitivity calculations and efficient computation of optimal matrices. For some applications like $>\!\! 10,000$ SGD steps, applying these optimal techniques becomes computationally expensive. We thus design an efficient Fourier-transform-based mechanism with only a minor utility loss. Extensive empirical evaluation on both example-level DP for image classification and user-level DP for language modeling demonstrates substantial improvements over all previous methods, including the widely-used DP-SGD. Though our primary application is to ML, our main DP results are applicable to arbitrary linear queries and hence may have much broader applicability.}
}
Endnote
%0 Conference Paper
%T Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning
%A Christopher A. Choquette-Choo
%A Hugh Brendan McMahan
%A J Keith Rush
%A Abhradeep Guha Thakurta
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-choquette-choo23a
%I PMLR
%P 5924--5963
%U https://proceedings.mlr.press/v202/choquette-choo23a.html
%V 202
%X We introduce new differentially private (DP) mechanisms for gradient-based machine learning (ML) with multiple passes (epochs) over a dataset, substantially improving the achievable privacy-utility-computation tradeoffs. We formalize the problem of DP mechanisms for adaptive streams with multiple participations and introduce a non-trivial extension of online matrix factorization DP mechanisms to our setting. This includes establishing the necessary theory for sensitivity calculations and efficient computation of optimal matrices. For some applications like $>\!\! 10,000$ SGD steps, applying these optimal techniques becomes computationally expensive. We thus design an efficient Fourier-transform-based mechanism with only a minor utility loss. Extensive empirical evaluation on both example-level DP for image classification and user-level DP for language modeling demonstrates substantial improvements over all previous methods, including the widely-used DP-SGD. Though our primary application is to ML, our main DP results are applicable to arbitrary linear queries and hence may have much broader applicability.
APA
Choquette-Choo, C.A., McMahan, H.B., Rush, J.K. & Guha Thakurta, A. (2023). Multi-Epoch Matrix Factorization Mechanisms for Private Machine Learning. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:5924-5963. Available from https://proceedings.mlr.press/v202/choquette-choo23a.html.