Differential privacy: general inferential limits via intervals of measures

James Bailie, Ruobin Gong
Proceedings of the Thirteenth International Symposium on Imprecise Probability: Theories and Applications, PMLR 215:11-24, 2023.

Abstract

Differential privacy (DP) is a mathematical standard for assessing the privacy provided by a data-release mechanism. We provide formulations of pure $\epsilon$-differential privacy first as a Lipschitz continuity condition and then using an object from the imprecise probability literature: the interval of measures. We utilise this second formulation to establish bounds on the appropriate likelihood function for $\epsilon$-DP data – and in turn derive limits on key quantities in both frequentist hypothesis testing and Bayesian inference. Under very mild conditions, these results are valid for arbitrary parameters, priors and data generating models. These bounds are weaker than those attainable when analysing specific data generating models or data-release mechanisms. However, they provide generally applicable limits on the ability to learn from differentially private data – even when the analyst’s knowledge of the model or mechanism is limited. They also shed light on the semantic interpretation of differential privacy, a subject of contention in the current literature.
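For context, the standard pure $\epsilon$-DP condition and its interval-of-measures reading can be sketched as follows (this uses common DP notation, not necessarily the paper's own formulation):

```latex
% Pure \epsilon-DP: for all neighbouring datasets x, x' and all measurable S,
\Pr[M(x) \in S] \;\le\; e^{\epsilon} \, \Pr[M(x') \in S].
% Writing P_x(\cdot) = \Pr[M(x) \in \cdot], this is the two-sided sandwich
e^{-\epsilon} \, P_{x'}(S) \;\le\; P_x(S) \;\le\; e^{\epsilon} \, P_{x'}(S),
% i.e. P_x lies in the interval of measures [e^{-\epsilon} P_{x'},\, e^{\epsilon} P_{x'}].
% Consequently the likelihood ratio for any released output s is bounded:
e^{-\epsilon} \;\le\; \frac{p(s \mid x)}{p(s \mid x')} \;\le\; e^{\epsilon},
% so the posterior odds between x and x' differ from the prior odds by a
% factor of at most e^{\epsilon}, for any prior and any data generating model.
```

These bounds hold regardless of the mechanism's internals, which is the sense in which the paper's inferential limits are "generally applicable".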

Cite this Paper


BibTeX
@InProceedings{pmlr-v215-bailie23a,
  title     = {Differential privacy: general inferential limits via intervals of measures},
  author    = {Bailie, James and Gong, Ruobin},
  booktitle = {Proceedings of the Thirteenth International Symposium on Imprecise Probability: Theories and Applications},
  pages     = {11--24},
  year      = {2023},
  editor    = {Miranda, Enrique and Montes, Ignacio and Quaeghebeur, Erik and Vantaggi, Barbara},
  volume    = {215},
  series    = {Proceedings of Machine Learning Research},
  month     = {11--14 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v215/bailie23a/bailie23a.pdf},
  url       = {https://proceedings.mlr.press/v215/bailie23a.html},
  abstract  = {Differential privacy (DP) is a mathematical standard for assessing the privacy provided by a data-release mechanism. We provide formulations of pure $\epsilon$-differential privacy first as a Lipschitz continuity condition and then using an object from the imprecise probability literature: the interval of measures. We utilise this second formulation to establish bounds on the appropriate likelihood function for $\epsilon$-DP data – and in turn derive limits on key quantities in both frequentist hypothesis testing and Bayesian inference. Under very mild conditions, these results are valid for arbitrary parameters, priors and data generating models. These bounds are weaker than those attainable when analysing specific data generating models or data-release mechanisms. However, they provide generally applicable limits on the ability to learn from differentially private data – even when the analyst’s knowledge of the model or mechanism is limited. They also shed light on the semantic interpretation of differential privacy, a subject of contention in the current literature.}
}
Endnote
%0 Conference Paper
%T Differential privacy: general inferential limits via intervals of measures
%A James Bailie
%A Ruobin Gong
%B Proceedings of the Thirteenth International Symposium on Imprecise Probability: Theories and Applications
%C Proceedings of Machine Learning Research
%D 2023
%E Enrique Miranda
%E Ignacio Montes
%E Erik Quaeghebeur
%E Barbara Vantaggi
%F pmlr-v215-bailie23a
%I PMLR
%P 11--24
%U https://proceedings.mlr.press/v215/bailie23a.html
%V 215
%X Differential privacy (DP) is a mathematical standard for assessing the privacy provided by a data-release mechanism. We provide formulations of pure $\epsilon$-differential privacy first as a Lipschitz continuity condition and then using an object from the imprecise probability literature: the interval of measures. We utilise this second formulation to establish bounds on the appropriate likelihood function for $\epsilon$-DP data – and in turn derive limits on key quantities in both frequentist hypothesis testing and Bayesian inference. Under very mild conditions, these results are valid for arbitrary parameters, priors and data generating models. These bounds are weaker than those attainable when analysing specific data generating models or data-release mechanisms. However, they provide generally applicable limits on the ability to learn from differentially private data – even when the analyst’s knowledge of the model or mechanism is limited. They also shed light on the semantic interpretation of differential privacy, a subject of contention in the current literature.
APA
Bailie, J. & Gong, R. (2023). Differential privacy: general inferential limits via intervals of measures. Proceedings of the Thirteenth International Symposium on Imprecise Probability: Theories and Applications, in Proceedings of Machine Learning Research 215:11-24. Available from https://proceedings.mlr.press/v215/bailie23a.html.