Locally Private Hypothesis Selection

Sivakanth Gopi, Gautam Kamath, Janardhan Kulkarni, Aleksandar Nikolov, Zhiwei Steven Wu, Huanyu Zhang
Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:1785-1816, 2020.

Abstract

We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution p and a set of k probability distributions Q, we aim to output, under the constraints of ε-differential privacy, a distribution from Q whose total variation distance to p is comparable to the best such distribution. This is a generalization of the classic problem of k-wise simple hypothesis testing, which corresponds to when p ∈ Q, and we wish to identify p. Absent privacy constraints, this problem requires O(log k) samples from p, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require Õ(k²) samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least Ω(k) samples. Second, for the special case of k-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring Õ(k) samples. Finally, we provide sequentially interactive algorithms for the general case, requiring Õ(k) samples and only O(log log k) rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction t, as well as lower bounds showing that they are near-optimal for every t. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.
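As a point of reference, the objective sketched in the abstract can be written out explicitly. The following is the standard agnostic formulation of hypothesis selection; the approximation factor α and additive slack β are left generic here, since the abstract does not pin down constants:

\[
d_{\mathrm{TV}}(p, q) \;=\; \sup_{S} \, \lvert p(S) - q(S) \rvert \;=\; \tfrac{1}{2} \lVert p - q \rVert_1,
\]
\[
\text{output } \hat{q} \in \mathcal{Q} \quad \text{such that} \quad d_{\mathrm{TV}}(p, \hat{q}) \;\le\; \alpha \cdot \min_{q \in \mathcal{Q}} d_{\mathrm{TV}}(p, q) \;+\; \beta.
\]

In the k-wise simple hypothesis testing special case, p ∈ Q and the minimum is zero, so the task reduces to identifying p itself.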
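The reduction target, maximum selection with adversarial comparators, is easy to illustrate. Below is a minimal sketch of the classic one-round strategy: issue all pairwise comparisons in parallel and return the item with the most wins. The comparator model (reliable only beyond a margin gamma), the names adversarial_compare and round_robin_max, and the specific constants are illustrative assumptions, not the paper's construction:

    import itertools
    import random

    def adversarial_compare(x, y, gamma=0.05):
        # Illustrative comparator: it reports the larger value
        # reliably only when |x - y| > gamma; within the margin
        # its answer may be arbitrary (modeled here as a coin flip).
        if abs(x - y) > gamma:
            return x > y
        return random.random() < 0.5

    def round_robin_max(values, compare):
        # One-round (fully parallel) maximum selection: all
        # O(k^2) pairwise comparisons can be issued at once, and
        # the index with the most wins is returned. Even when the
        # comparator answers adversarially inside its margin, the
        # winner's value remains provably close to the maximum.
        wins = [0] * len(values)
        for i, j in itertools.combinations(range(len(values)), 2):
            if compare(values[i], values[j]):
                wins[i] += 1
            else:
                wins[j] += 1
        return max(range(len(values)), key=lambda i: wins[i])

    values = [random.random() for _ in range(16)]
    winner = round_robin_max(values, adversarial_compare)
    print(winner, values[winner], max(values))

This one-round tournament costs Θ(k²) comparisons, while sequential knockout-style selection uses roughly O(k) comparisons at the price of many adaptive rounds; the paper's family of t-round algorithms interpolates between these extremes, with lower bounds showing near-optimality for every t.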

Cite this Paper


BibTeX
@InProceedings{pmlr-v125-gopi20a, title = {Locally Private Hypothesis Selection}, author = {Gopi, Sivakanth and Kamath, Gautam and Kulkarni, Janardhan and Nikolov, Aleksandar and Wu, Zhiwei Steven and Zhang, Huanyu}, booktitle = {Proceedings of Thirty Third Conference on Learning Theory}, pages = {1785--1816}, year = {2020}, editor = {Abernethy, Jacob and Agarwal, Shivani}, volume = {125}, series = {Proceedings of Machine Learning Research}, month = {09--12 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v125/gopi20a/gopi20a.pdf}, url = {https://proceedings.mlr.press/v125/gopi20a.html}, abstract = { We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution $p$ and a set of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the constraints of $\varepsilon$-differential privacy, a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution. This is a generalization of the classic problem of $k$-wise simple hypothesis testing, which corresponds to when $p \in \mathcal{Q}$, and we wish to identify $p$. Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require $\tilde O(k^2)$ samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least $\Omega(k)$ samples. Second, for the special case of $k$-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring $\tilde O(k)$ samples. Finally, we provide sequentially interactive algorithms for the general case, requiring $\tilde O(k)$ samples and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction $t$, as well as lower bounds showing that they are near-optimal for every $t$. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.} }
Endnote
%0 Conference Paper %T Locally Private Hypothesis Selection %A Sivakanth Gopi %A Gautam Kamath %A Janardhan Kulkarni %A Aleksandar Nikolov %A Zhiwei Steven Wu %A Huanyu Zhang %B Proceedings of Thirty Third Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2020 %E Jacob Abernethy %E Shivani Agarwal %F pmlr-v125-gopi20a %I PMLR %P 1785--1816 %U https://proceedings.mlr.press/v125/gopi20a.html %V 125 %X We initiate the study of hypothesis selection under local differential privacy. Given samples from an unknown probability distribution $p$ and a set of $k$ probability distributions $\mathcal{Q}$, we aim to output, under the constraints of $\varepsilon$-differential privacy, a distribution from $\mathcal{Q}$ whose total variation distance to $p$ is comparable to the best such distribution. This is a generalization of the classic problem of $k$-wise simple hypothesis testing, which corresponds to when $p \in \mathcal{Q}$, and we wish to identify $p$. Absent privacy constraints, this problem requires $O(\log k)$ samples from $p$, and it was recently shown that the same complexity is achievable under (central) differential privacy. However, the naive approach to this problem under local differential privacy would require $\tilde O(k^2)$ samples. We first show that the constraint of local differential privacy incurs an exponential increase in cost: any algorithm for this problem requires at least $\Omega(k)$ samples. Second, for the special case of $k$-wise simple hypothesis testing, we provide a non-interactive algorithm which nearly matches this bound, requiring $\tilde O(k)$ samples. Finally, we provide sequentially interactive algorithms for the general case, requiring $\tilde O(k)$ samples and only $O(\log \log k)$ rounds of interactivity. Our algorithms are achieved through a reduction to maximum selection with adversarial comparators, a problem of independent interest for which we initiate study in the parallel setting. For this problem, we provide a family of algorithms for each number of allowed rounds of interaction $t$, as well as lower bounds showing that they are near-optimal for every $t$. Notably, our algorithms result in exponential improvements on the round complexity of previous methods.
APA
Gopi, S., Kamath, G., Kulkarni, J., Nikolov, A., Wu, Z.S. & Zhang, H. (2020). Locally Private Hypothesis Selection. Proceedings of Thirty Third Conference on Learning Theory, in Proceedings of Machine Learning Research 125:1785-1816. Available from https://proceedings.mlr.press/v125/gopi20a.html.