Mean Estimation in the Add-Remove Model of Differential Privacy

Alex Kulesza; Ananda Theertha Suresh; Yuyan Wang

Mean Estimation in the Add-Remove Model of Differential Privacy

Alex Kulesza, Ananda Theertha Suresh, Yuyan Wang

Proceedings of the 41st International Conference on Machine Learning, PMLR 235:25642-25661, 2024.

Abstract

Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.

Cite this Paper

BibTeX

@InProceedings{pmlr-v235-kulesza24a,
  title = 	 {Mean Estimation in the Add-Remove Model of Differential Privacy},
  author =       {Kulesza, Alex and Suresh, Ananda Theertha and Wang, Yuyan},
  booktitle = 	 {Proceedings of the 41st International Conference on Machine Learning},
  pages = 	 {25642--25661},
  year = 	 {2024},
  editor = 	 {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = 	 {235},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {21--27 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v235/main/assets/kulesza24a/kulesza24a.pdf},
  url = 	 {https://proceedings.mlr.press/v235/kulesza24a.html},
  abstract = 	 {Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.}
}

Endnote

%0 Conference Paper
%T Mean Estimation in the Add-Remove Model of Differential Privacy
%A Alex Kulesza
%A Ananda Theertha Suresh
%A Yuyan Wang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp	
%F pmlr-v235-kulesza24a
%I PMLR
%P 25642--25661
%U https://proceedings.mlr.press/v235/kulesza24a.html
%V 235
%X Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.

APA

Kulesza, A., Suresh, A.T. & Wang, Y.. (2024). Mean Estimation in the Add-Remove Model of Differential Privacy. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:25642-25661 Available from https://proceedings.mlr.press/v235/kulesza24a.html.

Mean Estimation in the Add-Remove Model of Differential Privacy

Abstract

Cite this Paper

Related Material