Privately Answering Classification Queries in the Agnostic PAC Model

Anupama Nandi, Raef Bassily
Proceedings of the 31st International Conference on Algorithmic Learning Theory, PMLR 117:687-703, 2020.

Abstract

We revisit the problem of differentially private release of classification queries. In this problem, the goal is to design an algorithm that can accurately answer a sequence of classification queries based on a private training set while ensuring differential privacy. We formally study this problem in the agnostic PAC model and derive a new upper bound on the private sample complexity. Our results improve over those obtained in a recent work (Bassily et al., 2018) for the agnostic PAC setting. In particular, we give an improved construction that yields a tighter upper bound on the sample complexity. Moreover, unlike (Bassily et al., 2018), our accuracy guarantee does not involve any blow-up in the approximation error associated with the given hypothesis class. Given any hypothesis class with VC-dimension $d$, we show that our construction can privately answer up to $m$ classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{m}\,\alpha^{3/2}\right)$. Using recent results on private learning with auxiliary public data, we extend our construction to show that one can privately answer any number of classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{d}\,\alpha\right)$. When $\alpha=O\left(\frac{1}{\sqrt{d}}\right)$, our private sample complexity bound is essentially optimal.

Cite this Paper


BibTeX
@InProceedings{pmlr-v117-nandi20a,
  title     = {Privately Answering Classification Queries in the Agnostic PAC Model},
  author    = {Nandi, Anupama and Bassily, Raef},
  booktitle = {Proceedings of the 31st International Conference on Algorithmic Learning Theory},
  pages     = {687--703},
  year      = {2020},
  editor    = {Kontorovich, Aryeh and Neu, Gergely},
  volume    = {117},
  series    = {Proceedings of Machine Learning Research},
  month     = {08 Feb--11 Feb},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v117/nandi20a/nandi20a.pdf},
  url       = {https://proceedings.mlr.press/v117/nandi20a.html},
  abstract  = {We revisit the problem of differentially private release of classification queries. In this problem, the goal is to design an algorithm that can accurately answer a sequence of classification queries based on a private training set while ensuring differential privacy. We formally study this problem in the agnostic PAC model and derive a new upper bound on the private sample complexity. Our results improve over those obtained in a recent work (Bassily et al., 2018) for the agnostic PAC setting. In particular, we give an improved construction that yields a tighter upper bound on the sample complexity. Moreover, unlike (Bassily et al., 2018), our accuracy guarantee does not involve any blow-up in the approximation error associated with the given hypothesis class. Given any hypothesis class with VC-dimension $d$, we show that our construction can privately answer up to $m$ classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{m}\,\alpha^{3/2}\right)$. Using recent results on private learning with auxiliary public data, we extend our construction to show that one can privately answer any number of classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{d}\,\alpha\right)$. When $\alpha=O\left(\frac{1}{\sqrt{d}}\right)$, our private sample complexity bound is essentially optimal.}
}
Endnote
%0 Conference Paper
%T Privately Answering Classification Queries in the Agnostic PAC Model
%A Anupama Nandi
%A Raef Bassily
%B Proceedings of the 31st International Conference on Algorithmic Learning Theory
%C Proceedings of Machine Learning Research
%D 2020
%E Aryeh Kontorovich
%E Gergely Neu
%F pmlr-v117-nandi20a
%I PMLR
%P 687--703
%U https://proceedings.mlr.press/v117/nandi20a.html
%V 117
%X We revisit the problem of differentially private release of classification queries. In this problem, the goal is to design an algorithm that can accurately answer a sequence of classification queries based on a private training set while ensuring differential privacy. We formally study this problem in the agnostic PAC model and derive a new upper bound on the private sample complexity. Our results improve over those obtained in a recent work (Bassily et al., 2018) for the agnostic PAC setting. In particular, we give an improved construction that yields a tighter upper bound on the sample complexity. Moreover, unlike (Bassily et al., 2018), our accuracy guarantee does not involve any blow-up in the approximation error associated with the given hypothesis class. Given any hypothesis class with VC-dimension $d$, we show that our construction can privately answer up to $m$ classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{m}\,\alpha^{3/2}\right)$. Using recent results on private learning with auxiliary public data, we extend our construction to show that one can privately answer any number of classification queries with average excess error $\alpha$ using a private sample of size $\approx \frac{d}{\alpha^2}\,\max\left(1, \sqrt{d}\,\alpha\right)$. When $\alpha=O\left(\frac{1}{\sqrt{d}}\right)$, our private sample complexity bound is essentially optimal.
APA
Nandi, A. & Bassily, R. (2020). Privately Answering Classification Queries in the Agnostic PAC Model. Proceedings of the 31st International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 117:687-703. Available from https://proceedings.mlr.press/v117/nandi20a.html.