Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano

Chuan Guo, Alexandre Sablayrolles, Maziar Sanjabi
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:11998-12011, 2023.

Abstract

Differential privacy (DP) is by far the most widely accepted framework for mitigating privacy risks in machine learning. However, exactly how small the privacy parameter ϵ needs to be to protect against certain privacy risks in practice is still not well-understood. In this work, we study data reconstruction attacks for discrete data and analyze them under the framework of multiple hypothesis testing. For a learning algorithm satisfying (α,ϵ)-Rényi DP, we utilize different variants of the celebrated Fano’s inequality to upper bound the attack advantage of a data reconstruction adversary. Our bound can be numerically computed to relate the parameter ϵ to the desired level of privacy protection in practice, and it complements the empirical evidence for the effectiveness of DP against data reconstruction attacks even at relatively large values of ϵ.
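For intuition, the following is a minimal, compilable sketch of how the classical form of Fano’s inequality converts a Rényi DP guarantee into a reconstruction bound. The setup (a secret record X drawn uniformly from M candidates, a released model θ, and an adversarial estimate X̂) is a simplification for illustration; the paper’s actual bounds rely on refined variants of Fano’s inequality and are tighter than this back-of-the-envelope version.

% A minimal sketch of the classical Fano argument, not the paper's
% refined variants. Assumed setup (ours, for illustration): a secret
% record X uniform over M candidates, a released model \theta, and an
% adversarial estimate \hat{X}(\theta).
\documentclass{article}
\usepackage{amsmath}
\begin{document}
Fano's inequality bounds the error $P_e = \Pr[\hat{X}(\theta) \neq X]$
of \emph{any} reconstruction adversary:
\[
  H(X \mid \theta) \;\le\; H_b(P_e) + P_e \log(M - 1),
\]
where $H_b$ is the binary entropy. Since $X$ is uniform,
$H(X \mid \theta) = \log M - I(X; \theta)$, and using
$H_b(P_e) \le \log 2$ this rearranges to
\[
  P_e \;\ge\; 1 - \frac{I(X; \theta) + \log 2}{\log M}.
\]
If the learner satisfies $(\alpha, \epsilon)$-R\'enyi DP with
$\alpha \ge 1$, monotonicity of the R\'enyi divergence in $\alpha$
gives a KL bound of $\epsilon$ between output distributions on
datasets differing in the secret record, and convexity of KL in its
second argument yields $I(X; \theta) \le \epsilon$. Hence the
adversary's success probability is at most
$(\epsilon + \log 2)/\log M$: the larger the candidate set $M$, the
larger an $\epsilon$ one can tolerate at a fixed protection level.
\end{document}

As a plug-in check of this crude version of the bound: with M = 10^6 candidate records and ϵ = 4, the success probability is at most (4 + 0.69)/13.8 ≈ 0.34, consistent with the paper’s message that DP can meaningfully limit reconstruction even at moderately large ϵ.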

Cite this Paper

BibTeX
@InProceedings{pmlr-v202-guo23e,
  title     = {Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano},
  author    = {Guo, Chuan and Sablayrolles, Alexandre and Sanjabi, Maziar},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {11998--12011},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/guo23e/guo23e.pdf},
  url       = {https://proceedings.mlr.press/v202/guo23e.html},
  abstract  = {Differential privacy (DP) is by far the most widely accepted framework for mitigating privacy risks in machine learning. However, exactly how small the privacy parameter $\epsilon$ needs to be to protect against certain privacy risks in practice is still not well-understood. In this work, we study data reconstruction attacks for discrete data and analyze them under the framework of multiple hypothesis testing. For a learning algorithm satisfying $(\alpha, \epsilon)$-R\'enyi DP, we utilize different variants of the celebrated Fano’s inequality to upper bound the attack advantage of a data reconstruction adversary. Our bound can be numerically computed to relate the parameter $\epsilon$ to the desired level of privacy protection in practice, and it complements the empirical evidence for the effectiveness of DP against data reconstruction attacks even at relatively large values of $\epsilon$.}
}
Endnote
%0 Conference Paper
%T Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano
%A Chuan Guo
%A Alexandre Sablayrolles
%A Maziar Sanjabi
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-guo23e
%I PMLR
%P 11998--12011
%U https://proceedings.mlr.press/v202/guo23e.html
%V 202
%X Differential privacy (DP) is by far the most widely accepted framework for mitigating privacy risks in machine learning. However, exactly how small the privacy parameter $\epsilon$ needs to be to protect against certain privacy risks in practice is still not well-understood. In this work, we study data reconstruction attacks for discrete data and analyze them under the framework of multiple hypothesis testing. For a learning algorithm satisfying $(\alpha, \epsilon)$-R\'enyi DP, we utilize different variants of the celebrated Fano’s inequality to upper bound the attack advantage of a data reconstruction adversary. Our bound can be numerically computed to relate the parameter $\epsilon$ to the desired level of privacy protection in practice, and it complements the empirical evidence for the effectiveness of DP against data reconstruction attacks even at relatively large values of $\epsilon$.
APA
Guo, C., Sablayrolles, A. & Sanjabi, M. (2023). Analyzing Privacy Leakage in Machine Learning via Multiple Hypothesis Testing: A Lesson From Fano. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:11998-12011. Available from https://proceedings.mlr.press/v202/guo23e.html.
