Mean Estimation in the Add-Remove Model of Differential Privacy

Alex Kulesza, Ananda Theertha Suresh, Yuyan Wang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:25642-25661, 2024.

Abstract

Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-kulesza24a, title = {Mean Estimation in the Add-Remove Model of Differential Privacy}, author = {Kulesza, Alex and Suresh, Ananda Theertha and Wang, Yuyan}, booktitle = {Proceedings of the 41st International Conference on Machine Learning}, pages = {25642--25661}, year = {2024}, editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix}, volume = {235}, series = {Proceedings of Machine Learning Research}, month = {21--27 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/kulesza24a/kulesza24a.pdf}, url = {https://proceedings.mlr.press/v235/kulesza24a.html}, abstract = {Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.} }
Endnote
%0 Conference Paper %T Mean Estimation in the Add-Remove Model of Differential Privacy %A Alex Kulesza %A Ananda Theertha Suresh %A Yuyan Wang %B Proceedings of the 41st International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2024 %E Ruslan Salakhutdinov %E Zico Kolter %E Katherine Heller %E Adrian Weller %E Nuria Oliver %E Jonathan Scarlett %E Felix Berkenkamp %F pmlr-v235-kulesza24a %I PMLR %P 25642--25661 %U https://proceedings.mlr.press/v235/kulesza24a.html %V 235 %X Differential privacy is often studied under two different models of neighboring datasets: the add-remove model and the swap model. While the swap model is frequently used in the academic literature to simplify analysis, many practical applications rely on the more conservative add-remove model, where obtaining tight results can be difficult. Here, we study the problem of one-dimensional mean estimation under the add-remove model. We propose a new algorithm and show that it is min-max optimal, achieving the best possible constant in the leading term of the mean squared error for all $\epsilon$, and that this constant is the same as the optimal algorithm under the swap model. These results show that the add-remove and swap models give nearly identical errors for mean estimation, even though the add-remove model cannot treat the size of the dataset as public information. We also demonstrate empirically that our proposed algorithm yields at least a factor of two improvement in mean squared error over algorithms frequently used in practice. One of our main technical contributions is a new hourglass mechanism, which might be of independent interest in other scenarios.
APA
Kulesza, A., Suresh, A.T. & Wang, Y.. (2024). Mean Estimation in the Add-Remove Model of Differential Privacy. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:25642-25661 Available from https://proceedings.mlr.press/v235/kulesza24a.html.

Related Material