Differentially Private Quantiles

Jennifer Gillenwater, Matthew Joseph, Alex Kulesza
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3713-3722, 2021.

Abstract

Quantiles are often used for summarizing and understanding data. If that data is sensitive, it may be necessary to compute quantiles in a way that is differentially private, providing theoretical guarantees that the result does not reveal private information. However, when multiple quantiles are needed, existing differentially private algorithms fare poorly: they either compute quantiles individually, splitting the privacy budget, or summarize the entire distribution, wasting effort. In either case the result is reduced accuracy. In this work we propose an instance of the exponential mechanism that simultaneously estimates exactly $m$ quantiles from $n$ data points while guaranteeing differential privacy. The utility function is carefully structured to allow for an efficient implementation that returns estimates of all $m$ quantiles in time $O(mn\log(n) + m^2n)$. Experiments show that our method significantly outperforms the current state of the art on both real and synthetic data while remaining efficient enough to be practical.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-gillenwater21a, title = {Differentially Private Quantiles}, author = {Gillenwater, Jennifer and Joseph, Matthew and Kulesza, Alex}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {3713--3722}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/gillenwater21a/gillenwater21a.pdf}, url = {https://proceedings.mlr.press/v139/gillenwater21a.html}, abstract = {Quantiles are often used for summarizing and understanding data. If that data is sensitive, it may be necessary to compute quantiles in a way that is differentially private, providing theoretical guarantees that the result does not reveal private information. However, when multiple quantiles are needed, existing differentially private algorithms fare poorly: they either compute quantiles individually, splitting the privacy budget, or summarize the entire distribution, wasting effort. In either case the result is reduced accuracy. In this work we propose an instance of the exponential mechanism that simultaneously estimates exactly $m$ quantiles from $n$ data points while guaranteeing differential privacy. The utility function is carefully structured to allow for an efficient implementation that returns estimates of all $m$ quantiles in time $O(mn\log(n) + m^2n)$. Experiments show that our method significantly outperforms the current state of the art on both real and synthetic data while remaining efficient enough to be practical.} }
Endnote
%0 Conference Paper %T Differentially Private Quantiles %A Jennifer Gillenwater %A Matthew Joseph %A Alex Kulesza %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-gillenwater21a %I PMLR %P 3713--3722 %U https://proceedings.mlr.press/v139/gillenwater21a.html %V 139 %X Quantiles are often used for summarizing and understanding data. If that data is sensitive, it may be necessary to compute quantiles in a way that is differentially private, providing theoretical guarantees that the result does not reveal private information. However, when multiple quantiles are needed, existing differentially private algorithms fare poorly: they either compute quantiles individually, splitting the privacy budget, or summarize the entire distribution, wasting effort. In either case the result is reduced accuracy. In this work we propose an instance of the exponential mechanism that simultaneously estimates exactly $m$ quantiles from $n$ data points while guaranteeing differential privacy. The utility function is carefully structured to allow for an efficient implementation that returns estimates of all $m$ quantiles in time $O(mn\log(n) + m^2n)$. Experiments show that our method significantly outperforms the current state of the art on both real and synthetic data while remaining efficient enough to be practical.
APA
Gillenwater, J., Joseph, M. & Kulesza, A.. (2021). Differentially Private Quantiles. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:3713-3722 Available from https://proceedings.mlr.press/v139/gillenwater21a.html.

Related Material