Tight and Robust Private Mean Estimation with Few Users

Shyam Narayanan, Vahab Mirrokni, Hossein Esfandiari
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:16383-16412, 2022.

Abstract

In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an $(\varepsilon,\delta)$-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as $O(\frac{1}{\varepsilon}\log\frac{1}{\delta})$. Interestingly, this bound on the number of users is independent of the dimension (though the number of samples per user is allowed to depend polynomially on the dimension), unlike previous work, which requires the number of users to depend polynomially on the dimension. This resolves a problem first proposed by Amin et al. (2019). Moreover, our mechanism is robust against corruptions in up to $49\%$ of the users. Finally, our results also apply to optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al. (2020), and to a broader range of problems such as stochastic convex optimization and a variant of stochastic gradient descent, via a reduction to differentially private mean estimation.
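As a concrete point of reference for the problem setting, the sketch below implements the standard clip-and-noise baseline for user-level $(\varepsilon,\delta)$-DP mean estimation: average each user's samples, project the per-user averages onto an $\ell_2$ ball of radius tau, and add Gaussian-mechanism noise calibrated to the resulting sensitivity. This is only an illustrative baseline, not the paper's mechanism; the clipping radius tau, the function name, and the data model are assumptions made for the example.

import numpy as np

def user_level_dp_mean(user_samples, eps, delta, tau):
    """Baseline user-level (eps, delta)-DP mean estimator (clip and noise).

    user_samples: list of (m_i, d) arrays, one per user.
    tau: clipping radius; each user's average is projected onto the
         L2 ball of radius tau, so replacing one user's entire dataset
         changes the sum of clipped averages by at most 2 * tau in L2.
    """
    n = len(user_samples)
    # Each user contributes a single averaged vector.
    avgs = np.stack([s.mean(axis=0) for s in user_samples])  # shape (n, d)
    # Clip per-user averages to bound the L2 sensitivity.
    norms = np.linalg.norm(avgs, axis=1, keepdims=True)
    clipped = avgs * np.minimum(1.0, tau / np.maximum(norms, 1e-12))
    # Replacing one user moves the mean of clipped averages by at most 2*tau/n.
    sensitivity = 2.0 * tau / n
    # Standard Gaussian-mechanism noise scale for (eps, delta)-DP, eps <= 1.
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return clipped.mean(axis=0) + np.random.normal(0.0, sigma, size=clipped.shape[1])

# Example usage (hypothetical parameters): 1000 users, 50 samples each, d = 20.
rng = np.random.default_rng(0)
data = [rng.normal(0.5, 1.0, size=(50, 20)) for _ in range(1000)]
estimate = user_level_dp_mean(data, eps=1.0, delta=1e-6, tau=3.0)

Note that the added noise in this baseline has $\ell_2$ norm on the order of $\tau\sqrt{d}/(n\varepsilon)$, and for $d$-dimensional data the clipping radius tau itself typically grows with $\sqrt{d}$, so fixed accuracy forces the number of users $n$ to grow polynomially with the dimension. Removing exactly this dimension dependence in the user count, while also tolerating corrupted users, is the paper's contribution.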

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-narayanan22a,
  title     = {Tight and Robust Private Mean Estimation with Few Users},
  author    = {Narayanan, Shyam and Mirrokni, Vahab and Esfandiari, Hossein},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {16383--16412},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/narayanan22a/narayanan22a.pdf},
  url       = {https://proceedings.mlr.press/v162/narayanan22a.html},
  abstract  = {In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an $(\varepsilon,\delta)$-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as $O(\frac{1}{\varepsilon}\log\frac{1}{\delta})$. Interestingly, this bound on the number of users is independent of the dimension (though the number of samples per user is allowed to depend polynomially on the dimension), unlike previous work, which requires the number of users to depend polynomially on the dimension. This resolves a problem first proposed by Amin et al. (2019). Moreover, our mechanism is robust against corruptions in up to $49\%$ of the users. Finally, our results also apply to optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al. (2020), and to a broader range of problems such as stochastic convex optimization and a variant of stochastic gradient descent, via a reduction to differentially private mean estimation.}
}
APA
Narayanan, S., Mirrokni, V. & Esfandiari, H. (2022). Tight and Robust Private Mean Estimation with Few Users. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:16383-16412. Available from https://proceedings.mlr.press/v162/narayanan22a.html.
