Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead

Badih Ghazi; Ravi Kumar; Pasin Manurangsi; Rasmus Pagh

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:3505-3514, 2020.

Abstract

Differential privacy (DP) is a formal notion for quantifying the privacy loss of algorithms. Algorithms in the central model of DP achieve high accuracy but make the strongest trust assumptions, whereas those in the local DP model make the weakest trust assumptions but incur substantial accuracy loss. The shuffled DP model [Bittau et al. 2017, Erlingsson et al. 2019, Cheu et al. 2019] has recently emerged as a feasible middle ground between the central and local models, requiring weaker trust assumptions than the former while promising higher accuracy than the latter. In this paper, we obtain practical communication-efficient algorithms in the shuffled DP model for two basic aggregation primitives used in machine learning: 1) binary summation, and 2) histograms over a moderate number of buckets. Our algorithms achieve accuracy that is arbitrarily close to that of central DP algorithms, with an expected communication per user essentially matching what is needed without any privacy constraints. We demonstrate the practicality of our algorithms by experimentally evaluating them and comparing their performance to several widely used protocols such as Randomized Response [Warner 1965] and RAPPOR [Erlingsson et al. 2014].
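
For orientation, the binary summation task and the Randomized Response baseline named in the abstract can be sketched as below. This is a minimal illustration of the classical local-model mechanism, not the shuffled-model protocol proposed in the paper. The function names, the privacy parameter `epsilon`, and the synthetic inputs are assumptions made for the example.

```python
import math
import random
from typing import Sequence


def truth_probability(epsilon: float) -> float:
    """Probability of reporting the true bit under epsilon-DP randomized response."""
    return math.exp(epsilon) / (math.exp(epsilon) + 1.0)


def randomize_bit(bit: int, epsilon: float, rng: random.Random) -> int:
    """Each user keeps their bit with probability p and flips it otherwise."""
    return bit if rng.random() < truth_probability(epsilon) else 1 - bit


def estimate_sum(reports: Sequence[int], epsilon: float) -> float:
    """Unbiased estimate of the true number of ones from the noisy reports.

    With s true ones among n users, E[sum(reports)] = p*s + (1-p)*(n-s),
    which is solved for s below. Requires epsilon > 0.
    """
    n = len(reports)
    p = truth_probability(epsilon)
    return (sum(reports) - n * (1.0 - p)) / (2.0 * p - 1.0)


# Hypothetical data: 10,000 users, 3,000 of whom hold the bit 1.
rng = random.Random(0)
bits = [1] * 3000 + [0] * 7000
reports = [randomize_bit(b, 1.0, rng) for b in bits]
estimate = estimate_sum(reports, 1.0)
print(f"true sum = {sum(bits)}, estimate = {estimate:.1f}")
```

The error of this estimator grows on the order of the square root of the number of users, which is the local-model accuracy loss the abstract refers to.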

Cite this Paper

BibTeX

@InProceedings{pmlr-v119-ghazi20a,
  title = 	 {Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead},
  author =       {Ghazi, Badih and Kumar, Ravi and Manurangsi, Pasin and Pagh, Rasmus},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {3505--3514},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v119/ghazi20a/ghazi20a.pdf},
  url = 	 {https://proceedings.mlr.press/v119/ghazi20a.html},
  abstract = 	 {Differential privacy (DP) is a formal notion for quantifying the privacy loss of algorithms. Algorithms in the central model of DP achieve high accuracy but make the strongest trust assumptions whereas those in the local DP model make the weakest trust assumptions but incur substantial accuracy loss. The shuffled DP model [Bittau et al 2017, Erlingsson et al 2019, Cheu et al 19] has recently emerged as a feasible middle ground between the central and local models, providing stronger trust assumptions than the former while promising higher accuracies than the latter. In this paper, we obtain practical communication-efficient algorithms in the shuffled DP model for two basic aggregation primitives used in machine learning: 1) binary summation, and 2) histograms over a moderate number of buckets. Our algorithms achieve accuracy that is arbitrarily close to that of central DP algorithms with an expected communication per user essentially matching what is needed without any privacy constraints! We demonstrate the practicality of our algorithms by experimentally evaluating them and comparing their performance to several widely-used protocols such as Randomized Response [Warner 1965] and RAPPOR [Erlingsson et al. 2014].}
}

Endnote

%0 Conference Paper
%T Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead
%A Badih Ghazi
%A Ravi Kumar
%A Pasin Manurangsi
%A Rasmus Pagh
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-ghazi20a
%I PMLR
%P 3505--3514
%U https://proceedings.mlr.press/v119/ghazi20a.html
%V 119
%X Differential privacy (DP) is a formal notion for quantifying the privacy loss of algorithms. Algorithms in the central model of DP achieve high accuracy but make the strongest trust assumptions whereas those in the local DP model make the weakest trust assumptions but incur substantial accuracy loss. The shuffled DP model [Bittau et al 2017, Erlingsson et al 2019, Cheu et al 19] has recently emerged as a feasible middle ground between the central and local models, providing stronger trust assumptions than the former while promising higher accuracies than the latter. In this paper, we obtain practical communication-efficient algorithms in the shuffled DP model for two basic aggregation primitives used in machine learning: 1) binary summation, and 2) histograms over a moderate number of buckets. Our algorithms achieve accuracy that is arbitrarily close to that of central DP algorithms with an expected communication per user essentially matching what is needed without any privacy constraints! We demonstrate the practicality of our algorithms by experimentally evaluating them and comparing their performance to several widely-used protocols such as Randomized Response [Warner 1965] and RAPPOR [Erlingsson et al. 2014].

APA

Ghazi, B., Kumar, R., Manurangsi, P. & Pagh, R. (2020). Private Counting from Anonymous Messages: Near-Optimal Accuracy with Vanishing Communication Overhead. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:3505-3514. Available from https://proceedings.mlr.press/v119/ghazi20a.html.
