Nearly Linear-Time User-Level DP-SCO with Optimal Rates

Badih Ghazi; Ravi Kumar; Daogao Liu; Pasin Manurangsi

Nearly Linear-Time User-Level DP-SCO with Optimal Rates

Badih Ghazi, Ravi Kumar, Daogao Liu, Pasin Manurangsi

Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:2601-2636, 2026.

Abstract

User-level differentially private (DP) stochastic convex optimization has garnered significant attention due to the paramount importance of safeguarding user privacy in modern large-scale ML applications. Current methods, such as those based on DP stochastic gradient descent (SGD), often struggle with high gradient computation complexity or suboptimal utility due to the need to privatize every intermediate iterate. In this work, we introduce a new nearly linear-time algorithm that resolves this trade-off and achieves the optimal excess rates via an adaptive outlier removal framework. Our key innovation is integrating the sparse vector technique directly into the SGD loop, supported by a novel robust divergence analysis. This approach naturally bounds the sensitivity of gradient estimates without requiring privatization of all intermediate steps. Specifically, our mechanism computes a local concentration score to adaptively filter out users whose updates diverge from the population geometry. Crucially, this approach preserves the unbiasedness of the gradient estimate in well-concentrated regimes while strictly bounding sensitivity in the presence of outliers. We also explore extensions to the $\ell_\infty$ setting demonstrating the generality of our analysis.

Cite this Paper

BibTeX

@InProceedings{pmlr-v336-ghazi26a,
  title = 	 {Nearly Linear-Time User-Level DP-SCO with Optimal Rates},
  author =       {Ghazi, Badih and Kumar, Ravi and Liu, Daogao and Manurangsi, Pasin},
  booktitle = 	 {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = 	 {2601--2636},
  year = 	 {2026},
  editor = 	 {Hanneke, Steve and Lattimore, Tor},
  volume = 	 {336},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {29 Jun--03 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v336/main/assets/ghazi26a/ghazi26a.pdf},
  url = 	 {https://proceedings.mlr.press/v336/ghazi26a.html},
  abstract = 	 {User-level differentially private (DP) stochastic convex optimization has garnered significant attention due to the paramount importance of safeguarding user privacy in modern large-scale ML applications.  Current methods, such as those based on DP stochastic gradient descent (SGD), often struggle with high gradient computation complexity or suboptimal utility due to the need to privatize every intermediate iterate.  In this work, we introduce a new nearly linear-time algorithm that resolves this trade-off and achieves the optimal excess rates via an adaptive outlier removal framework. Our key innovation is integrating the sparse vector technique directly into the SGD loop, supported by a novel robust divergence analysis. This approach naturally bounds the sensitivity of gradient estimates without requiring privatization of all intermediate steps.  Specifically, our mechanism computes a local concentration score to adaptively filter out users whose updates diverge from the population geometry. Crucially, this  approach preserves the unbiasedness of the gradient estimate in well-concentrated regimes while strictly bounding sensitivity in the presence of outliers. We also explore extensions to the $\ell_\infty$ setting demonstrating the generality of our analysis.}
}

Endnote

%0 Conference Paper
%T Nearly Linear-Time User-Level DP-SCO with Optimal Rates
%A Badih Ghazi
%A Ravi Kumar
%A Daogao Liu
%A Pasin Manurangsi
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore	
%F pmlr-v336-ghazi26a
%I PMLR
%P 2601--2636
%U https://proceedings.mlr.press/v336/ghazi26a.html
%V 336
%X User-level differentially private (DP) stochastic convex optimization has garnered significant attention due to the paramount importance of safeguarding user privacy in modern large-scale ML applications.  Current methods, such as those based on DP stochastic gradient descent (SGD), often struggle with high gradient computation complexity or suboptimal utility due to the need to privatize every intermediate iterate.  In this work, we introduce a new nearly linear-time algorithm that resolves this trade-off and achieves the optimal excess rates via an adaptive outlier removal framework. Our key innovation is integrating the sparse vector technique directly into the SGD loop, supported by a novel robust divergence analysis. This approach naturally bounds the sensitivity of gradient estimates without requiring privatization of all intermediate steps.  Specifically, our mechanism computes a local concentration score to adaptively filter out users whose updates diverge from the population geometry. Crucially, this  approach preserves the unbiasedness of the gradient estimate in well-concentrated regimes while strictly bounding sensitivity in the presence of outliers. We also explore extensions to the $\ell_\infty$ setting demonstrating the generality of our analysis.

APA

Ghazi, B., Kumar, R., Liu, D. & Manurangsi, P.. (2026). Nearly Linear-Time User-Level DP-SCO with Optimal Rates. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:2601-2636 Available from https://proceedings.mlr.press/v336/ghazi26a.html.

Related Material

Download PDF