Mean Estimation in the Add-Remove Model of Differential Privacy

Alex Kulesza, Ananda Theertha Suresh, Yuyan Wang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:25642-25661, 2024.

Abstract

Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all ϵ, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.
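To make the setting concrete, the sketch below shows the standard noisy-sum/noisy-count baseline for ε-DP mean estimation in the add-remove model — the kind of commonly used estimator the paper improves upon, not the paper's optimal algorithm or its hourglass mechanism. In the add-remove model the dataset size is itself private, so the count must be privatized along with the sum; the budget split and the `[0, 1]` data range here are illustrative assumptions.

```python
import numpy as np

def dp_mean_add_remove(data, epsilon, rng=None):
    """Baseline epsilon-DP mean estimate in the add-remove model.

    Values are assumed to lie in [0, 1], so adding or removing one record
    changes the sum by at most 1 and the count by exactly 1 (sensitivity 1
    for each query). The privacy budget is split evenly between the two
    Laplace queries, giving noise scale 2/epsilon for each.
    """
    rng = np.random.default_rng() if rng is None else rng
    data = np.asarray(data, dtype=float)
    noisy_sum = data.sum() + rng.laplace(scale=2.0 / epsilon)
    noisy_count = len(data) + rng.laplace(scale=2.0 / epsilon)
    # Guard against a non-positive noisy count before dividing.
    return noisy_sum / max(noisy_count, 1.0)
```

For large ε the noise vanishes and the estimate approaches the true mean; the paper's contribution is an estimator whose mean squared error has the optimal leading constant for every ε, matching the best achievable in the swap model.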

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-kulesza24a,
  title     = {Mean Estimation in the Add-Remove Model of Differential Privacy},
  author    = {Kulesza, Alex and Suresh, Ananda Theertha and Wang, Yuyan},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {25642--25661},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/kulesza24a/kulesza24a.pdf},
  url       = {https://proceedings.mlr.press/v235/kulesza24a.html},
  abstract  = {Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.}
}
Endnote
%0 Conference Paper
%T Mean Estimation in the Add-Remove Model of Differential Privacy
%A Alex Kulesza
%A Ananda Theertha Suresh
%A Yuyan Wang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-kulesza24a
%I PMLR
%P 25642--25661
%U https://proceedings.mlr.press/v235/kulesza24a.html
%V 235
%X Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.
APA
Kulesza, A., Suresh, A.T. & Wang, Y. (2024). Mean Estimation in the Add-Remove Model of Differential Privacy. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:25642-25661. Available from https://proceedings.mlr.press/v235/kulesza24a.html.