Sample-and-threshold differential privacy: Histograms and applications

Graham Cormode, Akash Bharadwaj
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:1420-1431, 2022.

Abstract

Federated analytics seeks to compute accurate statistics from data distributed across users’ devices while providing a suitable privacy guarantee and being practically feasible to implement and scale. In this paper, we show how a strong (epsilon, delta)-Differential Privacy (DP) guarantee can be achieved for the fundamental problem of histogram generation in a federated setting, via a highly practical sampling-based procedure that does not add noise to disclosed data. Given the ubiquity of sampling in practice, we thus obtain a DP guarantee almost for free, avoid over-estimating histogram counts, and allow easy reasoning about how privacy guarantees may obscure minorities and outliers. Using such histograms, related problems such as heavy hitters and quantiles can be answered with provable error and privacy guarantees. Experimental results show that our sample-and-threshold approach is accurate and scalable.

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-cormode22a, title = { Sample-and-threshold differential privacy: Histograms and applications }, author = {Cormode, Graham and Bharadwaj, Akash}, booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics}, pages = {1420--1431}, year = {2022}, editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel}, volume = {151}, series = {Proceedings of Machine Learning Research}, month = {28--30 Mar}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v151/cormode22a/cormode22a.pdf}, url = {https://proceedings.mlr.press/v151/cormode22a.html}, abstract = { Federated analytics seeks to compute accurate statistics from data distributed across users’ devices while providing a suitable privacy guarantee and being practically feasible to implement and scale. In this paper, we show how a strong (epsilon, delta)-Differential Privacy (DP) guarantee can be achieved for the fundamental problem of histogram generation in a federated setting, via a highly practical sampling-based procedure that does not add noise to disclosed data. Given the ubiquity of sampling in practice, we thus obtain a DP guarantee almost for free, avoid over-estimating histogram counts, and allow easy reasoning about how privacy guarantees may obscure minorities and outliers. Using such histograms, related problems such as heavy hitters and quantiles can be answered with provable error and privacy guarantees. Experimental results show that our sample-and-threshold approach is accurate and scalable. } }
Endnote
%0 Conference Paper %T Sample-and-threshold differential privacy: Histograms and applications %A Graham Cormode %A Akash Bharadwaj %B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2022 %E Gustau Camps-Valls %E Francisco J. R. Ruiz %E Isabel Valera %F pmlr-v151-cormode22a %I PMLR %P 1420--1431 %U https://proceedings.mlr.press/v151/cormode22a.html %V 151 %X Federated analytics seeks to compute accurate statistics from data distributed across users’ devices while providing a suitable privacy guarantee and being practically feasible to implement and scale. In this paper, we show how a strong (epsilon, delta)-Differential Privacy (DP) guarantee can be achieved for the fundamental problem of histogram generation in a federated setting, via a highly practical sampling-based procedure that does not add noise to disclosed data. Given the ubiquity of sampling in practice, we thus obtain a DP guarantee almost for free, avoid over-estimating histogram counts, and allow easy reasoning about how privacy guarantees may obscure minorities and outliers. Using such histograms, related problems such as heavy hitters and quantiles can be answered with provable error and privacy guarantees. Experimental results show that our sample-and-threshold approach is accurate and scalable.
APA
Cormode, G. & Bharadwaj, A.. (2022). Sample-and-threshold differential privacy: Histograms and applications . Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:1420-1431 Available from https://proceedings.mlr.press/v151/cormode22a.html.

Related Material