PREM: Privately Answering Statistical Queries with Relative Error

Badih Ghazi, Cristóbal Guzmán, Pritish Kamath, Alexander Knop, Ravi Kumar, Pasin Manurangsi, Sushant Sachdeva
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:2460-2460, 2025.

Abstract

We introduce $\mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that achieves a {\em relative} error guarantee for statistical queries under $(\varepsilon, \delta)$-differential privacy (DP). Namely, for a domain ${\cal X}$, a family ${\cal F}$ of queries $f : {\cal X} \to \{0, 1\}$, and $\zeta > 0$, our framework yields a mechanism that on input dataset $D \in {\cal X}^n$ outputs a synthetic dataset $\widehat{D} \in {\cal X}^n$ such that all statistical queries in ${\cal F}$ on $D$, namely $\sum_{x \in D} f(x)$ for $f \in {\cal F}$, are within a $1 \pm \zeta$ {\em multiplicative} factor of the corresponding value on $\widehat{D}$ up to an {\em additive} error that is polynomial in $\log |{\cal F}|$, $\log |{\cal X}|$, $\log n$, $\log(1/\delta)$, $1/\varepsilon$, and $1/\zeta$. In contrast, any $(\varepsilon, \delta)$-DP mechanism is known to require worst-case additive error that is polynomial in at least one of $n, |{\cal F}|$, or $|{\cal X}|$. We complement our algorithm with nearly matching lower bounds.
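To make the guarantee concrete, here is a minimal sketch (not the PREM algorithm itself, which is not given in the abstract) of what the relative-error condition asks of a synthetic dataset: every statistical query count on $\widehat{D}$ must lie within a $(1 \pm \zeta)$ multiplicative factor of the count on $D$, up to an additive slack `alpha`. The datasets, threshold queries, and parameter values below are illustrative choices, not from the paper.

```python
# Sketch of the relative-error guarantee from the abstract. In PREM, the
# additive slack `alpha` is polynomial in log|F|, log|X|, log n, log(1/delta),
# 1/eps, and 1/zeta; here it is just a parameter.

def query_count(dataset, f):
    """Statistical query: sum of the 0/1 predicate f over the dataset."""
    return sum(f(x) for x in dataset)

def satisfies_relative_error(D, D_hat, queries, zeta, alpha):
    """True iff each count on D_hat is within (1 ± zeta)·count(D) ± alpha."""
    for f in queries:
        true_val = query_count(D, f)
        synth_val = query_count(D_hat, f)
        if not ((1 - zeta) * true_val - alpha
                <= synth_val
                <= (1 + zeta) * true_val + alpha):
            return False
    return True

# Toy usage: domain X = {0, ..., 9}, threshold queries f_t(x) = 1[x >= t].
D = [1, 3, 3, 7, 8, 9]
D_hat = [1, 3, 4, 7, 8, 9]  # a hypothetical synthetic dataset
queries = [lambda x, t=t: int(x >= t) for t in range(10)]
print(satisfies_relative_error(D, D_hat, queries, zeta=0.1, alpha=1.0))  # True
```

Note that with `zeta = alpha = 0` the same pair of datasets fails the check (the counts differ at threshold $t = 4$), which is exactly the regime where the relative and additive slack do the work.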

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-ghazi25a,
  title     = {PREM: Privately Answering Statistical Queries with Relative Error},
  author    = {Ghazi, Badih and Guzm\'{a}n, Crist\'{o}bal and Kamath, Pritish and Knop, Alexander and Kumar, Ravi and Manurangsi, Pasin and Sachdeva, Sushant},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages     = {2460--2460},
  year      = {2025},
  editor    = {Haghtalab, Nika and Moitra, Ankur},
  volume    = {291},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/ghazi25a/ghazi25a.pdf},
  url       = {https://proceedings.mlr.press/v291/ghazi25a.html},
  abstract  = {We introduce $\mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that achieves a {\em relative} error guarantee for statistical queries under $(\varepsilon, \delta)$-differential privacy (DP). Namely, for a domain ${\cal X}$, a family ${\cal F}$ of queries $f : {\cal X} \to \{0, 1\}$, and $\zeta > 0$, our framework yields a mechanism that on input dataset $D \in {\cal X}^n$ outputs a synthetic dataset $\widehat{D} \in {\cal X}^n$ such that all statistical queries in ${\cal F}$ on $D$, namely $\sum_{x \in D} f(x)$ for $f \in {\cal F}$, are within a $1 \pm \zeta$ {\em multiplicative} factor of the corresponding value on $\widehat{D}$ up to an {\em additive} error that is polynomial in $\log |{\cal F}|$, $\log |{\cal X}|$, $\log n$, $\log(1/\delta)$, $1/\varepsilon$, and $1/\zeta$. In contrast, any $(\varepsilon, \delta)$-DP mechanism is known to require worst-case additive error that is polynomial in at least one of $n, |{\cal F}|$, or $|{\cal X}|$. We complement our algorithm with nearly matching lower bounds.}
}
Endnote
%0 Conference Paper
%T PREM: Privately Answering Statistical Queries with Relative Error
%A Badih Ghazi
%A Cristóbal Guzmán
%A Pritish Kamath
%A Alexander Knop
%A Ravi Kumar
%A Pasin Manurangsi
%A Sushant Sachdeva
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-ghazi25a
%I PMLR
%P 2460--2460
%U https://proceedings.mlr.press/v291/ghazi25a.html
%V 291
%X We introduce $\mathsf{PREM}$ (Private Relative Error Multiplicative weight update), a new framework for generating synthetic data that achieves a {\em relative} error guarantee for statistical queries under $(\varepsilon, \delta)$-differential privacy (DP). Namely, for a domain ${\cal X}$, a family ${\cal F}$ of queries $f : {\cal X} \to \{0, 1\}$, and $\zeta > 0$, our framework yields a mechanism that on input dataset $D \in {\cal X}^n$ outputs a synthetic dataset $\widehat{D} \in {\cal X}^n$ such that all statistical queries in ${\cal F}$ on $D$, namely $\sum_{x \in D} f(x)$ for $f \in {\cal F}$, are within a $1 \pm \zeta$ {\em multiplicative} factor of the corresponding value on $\widehat{D}$ up to an {\em additive} error that is polynomial in $\log |{\cal F}|$, $\log |{\cal X}|$, $\log n$, $\log(1/\delta)$, $1/\varepsilon$, and $1/\zeta$. In contrast, any $(\varepsilon, \delta)$-DP mechanism is known to require worst-case additive error that is polynomial in at least one of $n, |{\cal F}|$, or $|{\cal X}|$. We complement our algorithm with nearly matching lower bounds.
APA
Ghazi, B., Guzmán, C., Kamath, P., Knop, A., Kumar, R., Manurangsi, P. & Sachdeva, S. (2025). PREM: Privately Answering Statistical Queries with Relative Error. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:2460-2460. Available from https://proceedings.mlr.press/v291/ghazi25a.html.