Asymptotically Optimal Locally Private Heavy Hitters via Parameterized Sketches

Hao Wu, Anthony Wirth
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:7766-7798, 2022.

Abstract

We study the frequency estimation problem under the local differential privacy model. Frequency estimation is a fundamental computational question, and differential privacy has become the de-facto standard, with the local version (LDP) affording even greater protection. On large input domains, sketching methods and hierarchical search methods are commonly and successfully, in practice, applied for reducing the size of the domain, and for identifying frequent elements. It is therefore of interest whether the current theoretical analysis of such algorithms is tight, or whether we can obtain algorithms in a similar vein that achieve optimal error guarantee. We introduce two algorithms for LDP frequency estimation. One solves the fundamental frequency oracle problem; the other solves the well-known heavy hitters identification problem. As a function of failure probability, \ensuremath{\beta}, the former achieves optimal worst-case estimation error for every \ensuremath{\beta}; the latter is optimal when \ensuremath{\beta} is at least inverse polynomial in n, the number of users. In each algorithm, server running time and memory usage are tilde{O}(n) and tilde{O}(sqrt{n}), respectively, while user running time and memory usage are both tilde{O}(1). Our frequency-oracle algorithm achieves lower estimation error than Bassily et al. (NeurIPS 2017). On the other hand, our heavy hitters identification method improves the worst-case error of TreeHist (ibid) by a factor of Omega(sqrt{log n}); it avoids invoking error-correcting codes, known to be theoretically powerful, but yet to be implemented.

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-wu22e, title = { Asymptotically Optimal Locally Private Heavy Hitters via Parameterized Sketches }, author = {Wu, Hao and Wirth, Anthony}, booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics}, pages = {7766--7798}, year = {2022}, editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel}, volume = {151}, series = {Proceedings of Machine Learning Research}, month = {28--30 Mar}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v151/wu22e/wu22e.pdf}, url = {https://proceedings.mlr.press/v151/wu22e.html}, abstract = { We study the frequency estimation problem under the local differential privacy model. Frequency estimation is a fundamental computational question, and differential privacy has become the de-facto standard, with the local version (LDP) affording even greater protection. On large input domains, sketching methods and hierarchical search methods are commonly and successfully, in practice, applied for reducing the size of the domain, and for identifying frequent elements. It is therefore of interest whether the current theoretical analysis of such algorithms is tight, or whether we can obtain algorithms in a similar vein that achieve optimal error guarantee. We introduce two algorithms for LDP frequency estimation. One solves the fundamental frequency oracle problem; the other solves the well-known heavy hitters identification problem. As a function of failure probability, \ensuremath{\beta}, the former achieves optimal worst-case estimation error for every \ensuremath{\beta}; the latter is optimal when \ensuremath{\beta} is at least inverse polynomial in n, the number of users. In each algorithm, server running time and memory usage are tilde{O}(n) and tilde{O}(sqrt{n}), respectively, while user running time and memory usage are both tilde{O}(1). Our frequency-oracle algorithm achieves lower estimation error than Bassily et al. (NeurIPS 2017). On the other hand, our heavy hitters identification method improves the worst-case error of TreeHist (ibid) by a factor of Omega(sqrt{log n}); it avoids invoking error-correcting codes, known to be theoretically powerful, but yet to be implemented. } }
Endnote
%0 Conference Paper %T Asymptotically Optimal Locally Private Heavy Hitters via Parameterized Sketches %A Hao Wu %A Anthony Wirth %B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2022 %E Gustau Camps-Valls %E Francisco J. R. Ruiz %E Isabel Valera %F pmlr-v151-wu22e %I PMLR %P 7766--7798 %U https://proceedings.mlr.press/v151/wu22e.html %V 151 %X We study the frequency estimation problem under the local differential privacy model. Frequency estimation is a fundamental computational question, and differential privacy has become the de-facto standard, with the local version (LDP) affording even greater protection. On large input domains, sketching methods and hierarchical search methods are commonly and successfully, in practice, applied for reducing the size of the domain, and for identifying frequent elements. It is therefore of interest whether the current theoretical analysis of such algorithms is tight, or whether we can obtain algorithms in a similar vein that achieve optimal error guarantee. We introduce two algorithms for LDP frequency estimation. One solves the fundamental frequency oracle problem; the other solves the well-known heavy hitters identification problem. As a function of failure probability, \ensuremath{\beta}, the former achieves optimal worst-case estimation error for every \ensuremath{\beta}; the latter is optimal when \ensuremath{\beta} is at least inverse polynomial in n, the number of users. In each algorithm, server running time and memory usage are tilde{O}(n) and tilde{O}(sqrt{n}), respectively, while user running time and memory usage are both tilde{O}(1). Our frequency-oracle algorithm achieves lower estimation error than Bassily et al. (NeurIPS 2017). On the other hand, our heavy hitters identification method improves the worst-case error of TreeHist (ibid) by a factor of Omega(sqrt{log n}); it avoids invoking error-correcting codes, known to be theoretically powerful, but yet to be implemented.
APA
Wu, H. & Wirth, A.. (2022). Asymptotically Optimal Locally Private Heavy Hitters via Parameterized Sketches . Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:7766-7798 Available from https://proceedings.mlr.press/v151/wu22e.html.

Related Material