Lower Bounds for Differential Privacy Under Continual Observation and Online Threshold Queries

Edith Cohen, Xin Lyu, Jelani Nelson, Tamás Sarlós, Uri Stemmer
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:1200-1222, 2024.

Abstract

One of the most basic problems for studying the “price of privacy over time” is the so-called {\em private counter problem}, introduced by Dwork et al. (2010) and Chan et al. (2011). In this problem, we aim to track the number of {\em events} that occur over time, while hiding the existence of every single event. More specifically, in every time step $t\in[T]$ we learn (in an online fashion) that $\Delta_t\geq 0$ new events have occurred, and must respond with an estimate $n_t\approx\sum_{j=1}^t \Delta_j$. The privacy requirement is that {\em all of the outputs together}, across all time steps, satisfy {\em event-level} differential privacy. The main question here is how the error must depend on the total number of time steps $T$ and the total number of events $n$. Dwork et al. (2015) showed an upper bound of $O\left(\log(T)+\log^2(n)\right)$, and Henzinger et al. (2023) showed a lower bound of $\Omega\left( \min\{\log n, \log T\} \right)$. We show a new lower bound of $\Omega\left(\min\{n,\log T\}\right)$, which is tight w.r.t. the dependence on $T$, and is tight in the sparse case where $\log^2 n=O(\log T)$. Our lower bound has the following implications:
\begin{itemize}
\item We show that our lower bound extends to the {\em online thresholds} problem, where the goal is to privately answer many “quantile queries” when these queries are presented one-by-one. This resolves an open question of Bun et al. (2017).
\item Our lower bound implies, for the first time, a separation between the number of mistakes obtainable by a private online learner and a non-private online learner. This partially resolves a COLT’22 open question published by Sanyal and Ramponi.
\item Our lower bound also yields the first separation between the standard model of private online learning and a recently proposed relaxed variant of it, called {\em private online prediction}.
\end{itemize}
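
To make the setting concrete, below is a minimal sketch of the classic binary-tree (“binary mechanism”) counter of Dwork et al. (2010) and Chan et al. (2011), the standard baseline for the private counter problem. This is not the construction from this paper, and its additive error scales roughly like $\log^{1.5}(T)/\varepsilon$ rather than matching the bounds discussed above; the class name `BinaryTreeCounter`, the parameters `T` and `eps`, and the per-node Laplace calibration via basic composition are illustrative assumptions.

```python
# Illustrative sketch of the binary-tree counter for the private counter
# problem (Dwork et al. 2010; Chan et al. 2011). Not the paper's construction;
# the budget split via basic composition is a simple, untuned choice.
import math
import random


def laplace(scale: float) -> float:
    """Sample Laplace(0, scale) noise as a difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)


class BinaryTreeCounter:
    """Event-level DP counter; error grows roughly like log^{1.5}(T) / eps."""

    def __init__(self, T: int, eps: float):
        self.L = max(1, math.ceil(math.log2(T + 1)))  # number of bits of t
        # Each event is folded into at most L+1 released partial sums,
        # so split the budget across them (basic composition).
        self.scale = (self.L + 1) / eps
        self.psum = [0.0] * (self.L + 1)   # exact dyadic partial sums
        self.noisy = [0.0] * (self.L + 1)  # released noisy partial sums
        self.t = 0

    def step(self, delta: int) -> float:
        """Receive Delta_t >= 0 new events; return a private estimate of n_t."""
        self.t += 1
        i = (self.t & -self.t).bit_length() - 1  # index of lowest set bit of t
        # Fold the lower partial sums (plus the new events) into level i,
        # reset the lower levels, and release a fresh noisy value for level i.
        self.psum[i] = delta + sum(self.psum[j] for j in range(i))
        for j in range(i):
            self.psum[j] = 0.0
            self.noisy[j] = 0.0
        self.noisy[i] = self.psum[i] + laplace(self.scale)
        # The prefix sum n_t is covered by the dyadic intervals at the set bits of t.
        return sum(self.noisy[j] for j in range(self.L + 1) if (self.t >> j) & 1)


# Example usage: a sparse stream with n << T events.
counter = BinaryTreeCounter(T=1 << 14, eps=1.0)
for t in range(1, (1 << 14) + 1):
    estimate = counter.step(delta=1 if random.random() < 0.01 else 0)
```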

Cite this Paper


BibTeX
@InProceedings{pmlr-v247-cohen24b,
  title     = {Lower Bounds for Differential Privacy Under Continual Observation and Online Threshold Queries},
  author    = {Cohen, Edith and Lyu, Xin and Nelson, Jelani and Sarl\'{o}s, Tam\'{a}s and Stemmer, Uri},
  booktitle = {Proceedings of Thirty Seventh Conference on Learning Theory},
  pages     = {1200--1222},
  year      = {2024},
  editor    = {Agrawal, Shipra and Roth, Aaron},
  volume    = {247},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--03 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v247/cohen24b/cohen24b.pdf},
  url       = {https://proceedings.mlr.press/v247/cohen24b.html}
}
APA
Cohen, E., Lyu, X., Nelson, J., Sarlós, T., & Stemmer, U. (2024). Lower Bounds for Differential Privacy Under Continual Observation and Online Threshold Queries. Proceedings of Thirty Seventh Conference on Learning Theory, in Proceedings of Machine Learning Research 247:1200-1222. Available from https://proceedings.mlr.press/v247/cohen24b.html.
