[edit]

# Tight and Robust Private Mean Estimation with Few Users

*Proceedings of the 39th International Conference on Machine Learning*, PMLR 162:16383-16412, 2022.

#### Abstract

In this work, we study high-dimensional mean estimation under user-level differential privacy, and design an $(\varepsilon,\delta)$-differentially private mechanism using as few users as possible. In particular, we provide a nearly optimal trade-off between the number of users and the number of samples per user required for private mean estimation, even when the number of users is as low as $O(\frac{1}{\varepsilon}\log\frac{1}{\delta})$. Interestingly, this bound on the number of

*users*is independent of the dimension (though the number of*samples per user*is allowed to depend polynomially on the dimension), unlike the previous work that requires the number of users to depend polynomially on the dimension. This resolves a problem first proposed by Amin et al. (2019). Moreover, our mechanism is robust against corruptions in up to $49%$ of the users. Finally, our results also apply to optimal algorithms for privately learning discrete distributions with few users, answering a question of Liu et al. (2020), and a broader range of problems such as stochastic convex optimization and a variant of stochastic gradient descent via a reduction to differentially private mean estimation.