[edit]
Private Vector Mean Estimation in the Shuffle Model: Optimal Rates Require Many Messages
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:1945-1970, 2024.
Abstract
We study the problem of private vector mean estimation in the shuffle model of privacy where n users each have a unit vector v(i)∈Rd. We propose a new multi-message protocol that achieves the optimal error using O(min messages per user. Moreover, we show that any (unbiased) protocol that achieves optimal error must require each user to send \Omega(\min(n\varepsilon^2,d)/\log(n)) messages, demonstrating the optimality of our message complexity up to logarithmic factors. Additionally, we study the single-message setting and design a protocol that achieves mean squared error O(dn^{d/(d+2)}\varepsilon^{-4/(d+2)}). Moreover, we show that any single-message protocol must incur mean squared error \Omega(dn^{d/(d+2)}), showing that our protocol is optimal in the standard setting where \varepsilon = \Theta(1). Finally, we study robustness to malicious users and show that malicious users can incur large additive error with a single shuffler.