Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication

Jayadev Acharya, Ziteng Sun, Huanyu Zhang
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1120-1129, 2019.

Abstract

We study the problem of estimating $k$-ary distributions under $\varepsilon$-local differential privacy. $n$ samples are distributed across users who send privatized versions of their sample to a central server. All previously known sample-optimal algorithms require linear (in $k$) communication from each user in the high privacy regime $(\varepsilon=O(1))$, and run in time that grows as $n\cdot k$, which can be prohibitive for large domain size $k$. We propose Hadamard Response (HR), a local privatization scheme that requires no shared randomness and is symmetric with respect to the users. Our scheme has order-optimal sample complexity for all $\varepsilon$, a communication of at most $\log k+2$ bits per user, and nearly linear running time of $\tilde{O}(n + k)$. Our encoding and decoding are based on Hadamard matrices and are simple to implement. The statistical performance relies on the coding-theoretic aspects of Hadamard matrices, i.e., the large Hamming distance between the rows. An efficient implementation of the algorithm using the Fast Walsh-Hadamard transform gives the computational gains. We compare our approach with Randomized Response (RR), RAPPOR, and subset-selection mechanisms (SS), both theoretically and experimentally. For $k=10000$, our algorithm runs about 100x faster than SS and RAPPOR.
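The abstract relies on two properties of Hadamard matrices: the large Hamming distance between rows, and the existence of a fast transform. The Python sketch below illustrates only those two properties for the Sylvester construction; it is not the authors' HR encoder or decoder. All function names (`hadamard_entry`, `hadamard_matrix`, `fwht`, `hamming_distance`) are hypothetical names chosen for this illustration.

```python
# Illustrative sketch only: properties of Sylvester Hadamard matrices,
# not the Hadamard Response privatization scheme itself.

def hadamard_entry(i, j):
    """Entry (i, j) of the Sylvester Hadamard matrix: (-1)^popcount(i & j)."""
    return -1 if bin(i & j).count("1") % 2 else 1


def hadamard_matrix(size):
    """Dense size x size Sylvester Hadamard matrix (size a power of two)."""
    return [[hadamard_entry(i, j) for j in range(size)] for i in range(size)]


def hamming_distance(row_a, row_b):
    """Number of positions where two rows differ."""
    return sum(x != y for x, y in zip(row_a, row_b))


def fwht(values):
    """Unnormalized fast Walsh-Hadamard transform in O(K log K) operations.

    Returns H @ values for the Sylvester Hadamard matrix H of matching size.
    """
    out = list(values)
    n = len(out)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    h = 1
    while h < n:
        # Butterfly step: combine blocks of width h pairwise.
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                a, b = out[i], out[i + h]
                out[i], out[i + h] = a + b, a - b
        h *= 2
    return out


if __name__ == "__main__":
    K = 8
    H = hadamard_matrix(K)
    # Any two distinct rows differ in exactly K / 2 positions.
    distances = {hamming_distance(H[a], H[b])
                 for a in range(K) for b in range(a + 1, K)}
    print("pairwise Hamming distances:", distances)
    # The fast transform agrees with the dense matrix-vector product.
    v = [3, 1, 4, 1, 5, 9, 2, 6]
    dense = [sum(H[i][j] * v[j] for j in range(K)) for i in range(K)]
    print("fwht matches dense product:", fwht(v) == dense)
```

The dense product costs $K^2$ operations, while the butterfly loop costs $K \log_2 K$. A gap of this kind is consistent with the $\tilde{O}(n + k)$ running time stated in the abstract, though the paper's actual decoder may differ from this sketch.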

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-acharya19a,
  title     = {Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication},
  author    = {Acharya, Jayadev and Sun, Ziteng and Zhang, Huanyu},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {1120--1129},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/acharya19a/acharya19a.pdf},
  url       = {https://proceedings.mlr.press/v89/acharya19a.html},
  abstract  = {We study the problem of estimating $k$-ary distributions under $\varepsilon$-local differential privacy. $n$ samples are distributed across users who send privatized versions of their sample to a central server. All previously known sample-optimal algorithms require linear (in $k$) communication from each user in the high privacy regime $(\varepsilon=O(1))$, and run in time that grows as $n\cdot k$, which can be prohibitive for large domain size $k$. We propose Hadamard Response (HR), a local privatization scheme that requires no shared randomness and is symmetric with respect to the users. Our scheme has order-optimal sample complexity for all $\varepsilon$, a communication of at most $\log k+2$ bits per user, and nearly linear running time of $\tilde{O}(n + k)$. Our encoding and decoding are based on Hadamard matrices and are simple to implement. The statistical performance relies on the coding-theoretic aspects of Hadamard matrices, i.e., the large Hamming distance between the rows. An efficient implementation of the algorithm using the Fast Walsh-Hadamard transform gives the computational gains. We compare our approach with Randomized Response (RR), RAPPOR, and subset-selection mechanisms (SS), both theoretically and experimentally. For $k=10000$, our algorithm runs about 100x faster than SS and RAPPOR.}
}
Endnote
%0 Conference Paper
%T Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication
%A Jayadev Acharya
%A Ziteng Sun
%A Huanyu Zhang
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-acharya19a
%I PMLR
%P 1120--1129
%U https://proceedings.mlr.press/v89/acharya19a.html
%V 89
%X We study the problem of estimating $k$-ary distributions under $\varepsilon$-local differential privacy. $n$ samples are distributed across users who send privatized versions of their sample to a central server. All previously known sample-optimal algorithms require linear (in $k$) communication from each user in the high privacy regime $(\varepsilon=O(1))$, and run in time that grows as $n\cdot k$, which can be prohibitive for large domain size $k$. We propose Hadamard Response (HR), a local privatization scheme that requires no shared randomness and is symmetric with respect to the users. Our scheme has order-optimal sample complexity for all $\varepsilon$, a communication of at most $\log k+2$ bits per user, and nearly linear running time of $\tilde{O}(n + k)$. Our encoding and decoding are based on Hadamard matrices and are simple to implement. The statistical performance relies on the coding-theoretic aspects of Hadamard matrices, i.e., the large Hamming distance between the rows. An efficient implementation of the algorithm using the Fast Walsh-Hadamard transform gives the computational gains. We compare our approach with Randomized Response (RR), RAPPOR, and subset-selection mechanisms (SS), both theoretically and experimentally. For $k=10000$, our algorithm runs about 100x faster than SS and RAPPOR.
APA
Acharya, J., Sun, Z. & Zhang, H. (2019). Hadamard Response: Estimating Distributions Privately, Efficiently, and with Little Communication. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:1120-1129. Available from https://proceedings.mlr.press/v89/acharya19a.html.
