Linear Queries Estimation with Local Differential Privacy
[edit]
Proceedings of Machine Learning Research, PMLR 89:721729, 2019.
Abstract
We study the problem of estimating a set of d linear queries with respect to some unknown distribution p over a domain $[J]$ based on a sensitive data set of n individuals under the constraint of local differential privacy. This problem subsumes a wide range of estimation tasks, e.g., distribution estimation and ddimensional mean estimation. We provide new algorithms for both the offline (nonadaptive) and adaptive versions of this problem. In the offline setting, the set of queries are fixed before the algorithm starts. In the regime where $n < d^2/\log(J)$, our algorithms attain $L_2$ estimation error that is independent of d. For the special case of distribution estimation, we show that projecting the output estimate of an algorithm due to [Acharya et al. 2018] on the probability simplex yields an $L_2$ error that depends only sublogarithmically on $J$ in the regime where $n < J^2/\log(J)$. Our bounds are within a factor of at most $(\log(J))^{1/4}$ from the optimal $L_2$ error. These results show the possibility of accurate estimation of linear queries in the highdimensional settings under the $L_2$ error criterion. In the adaptive setting, the queries are generated over d rounds; one query at a time. In each round, a query can be chosen adaptively based on all the history of previous queries and answers. We give an algorithm for this problem with optimal $L_{\infty}$ estimation error (worst error in the estimated values for the queries w.r.t. the data distribution). Our bound matches a lower bound on the $L_{\infty}$ error in the offline version of this problem [Duchi et al. 2013].
Related Material


