Differential Privacy in Distributed Learning: Beyond Uniformly Bounded Stochastic Gradients

Yue Huang, Jiaojiao Zhang, Qing Ling
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:3223-3231, 2025.

Abstract

This paper explores locally differentially private distributed algorithms that solve non-convex empirical risk minimization problems. Traditional approaches often assume uniformly bounded stochastic gradients, which may not hold in practice. To address this issue, we propose differentially \textbf{Pri}vate \textbf{S}tochastic recursive \textbf{M}omentum with gr\textbf{A}dient clipping (PriSMA) that judiciously integrates clipping and momentum to enhance utility while guaranteeing privacy. Without assuming uniformly bounded stochastic gradients, given privacy requirement $(\epsilon,\delta)$, PriSMA achieves a learning error of $\tilde{\mathcal{O}}\big((\frac{\sqrt{d}}{\sqrt{M}N\epsilon})^\frac{2}{5}\big)$, where $M$ is the number of clients, $N$ is the number of data samples on each client and $d$ is the model dimension. This learning error bound is better than the state-of-the-art $\tilde{\mathcal{O}}\big((\frac{\sqrt{d}}{{\sqrt{M}N\epsilon}})^\frac{1}{3}\big)$ in terms of the dependence on $M$ and $N$.
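The abstract names the two ingredients PriSMA combines — per-gradient l2 clipping and a stochastic recursive momentum (STORM-style) update, with noise added for local differential privacy — but not the exact recursion or noise calibration. The following is a minimal single-client, deterministic-gradient sketch of that combination; the function name `prisma_sketch`, the particular recursion, and all default parameters are illustrative assumptions, not the paper's algorithm.

```python
import numpy as np

def l2_clip(v, c):
    """Scale v down so that its l2 norm is at most c (standard l2 clipping)."""
    n = np.linalg.norm(v)
    return v if n <= c else v * (c / n)

def prisma_sketch(grad, x0, steps=200, lr=0.05, beta=0.1,
                  clip_c=1.0, noise_std=0.0, seed=0):
    """Clipped recursive-momentum descent with Gaussian noise (illustrative).

    Assumed recursion (STORM-style momentum with clipped gradients):
        m_t     = clip(g(x_t)) + (1 - beta) * (m_{t-1} - clip(g(x_{t-1}))) + noise
        x_{t+1} = x_t - lr * m_t
    Clipping bounds each gradient's sensitivity so noise_std can be calibrated
    to an (epsilon, delta) budget without assuming gradients are bounded.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    m = l2_clip(grad(x), clip_c) + noise_std * rng.standard_normal(x.shape)
    for _ in range(steps):
        x_prev, x = x, x - lr * m
        m = (l2_clip(grad(x), clip_c)
             + (1 - beta) * (m - l2_clip(grad(x_prev), clip_c))
             + noise_std * rng.standard_normal(x.shape))
    return x
```

On a simple quadratic (`grad = lambda v: v`) with `noise_std=0.0`, the iterate contracts toward the minimizer even though early gradients exceed the clipping threshold, which is the point of pairing clipping with the momentum correction.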

Cite this Paper

BibTeX
@InProceedings{pmlr-v258-huang25b,
  title     = {Differential Privacy in Distributed Learning: Beyond Uniformly Bounded Stochastic Gradients},
  author    = {Huang, Yue and Zhang, Jiaojiao and Ling, Qing},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages     = {3223--3231},
  year      = {2025},
  editor    = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume    = {258},
  series    = {Proceedings of Machine Learning Research},
  month     = {03--05 May},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/huang25b/huang25b.pdf},
  url       = {https://proceedings.mlr.press/v258/huang25b.html},
  abstract  = {This paper explores locally differentially private distributed algorithms that solve non-convex empirical risk minimization problems. Traditional approaches often assume uniformly bounded stochastic gradients, which may not hold in practice. To address this issue, we propose differentially \textbf{Pri}vate \textbf{S}tochastic recursive \textbf{M}omentum with gr\textbf{A}dient clipping (PriSMA) that judiciously integrates clipping and momentum to enhance utility while guaranteeing privacy. Without assuming uniformly bounded stochastic gradients, given privacy requirement $(\epsilon,\delta)$, PriSMA achieves a learning error of $\tilde{\mathcal{O}}\big((\frac{\sqrt{d}}{\sqrt{M}N\epsilon})^\frac{2}{5}\big)$, where $M$ is the number of clients, $N$ is the number of data samples on each client and $d$ is the model dimension. This learning error bound is better than the state-of-the-art $\tilde{\mathcal{O}}\big((\frac{\sqrt{d}}{{\sqrt{M}N\epsilon}})^\frac{1}{3}\big)$ in terms of the dependence on $M$ and $N$.}
}
Endnote
%0 Conference Paper
%T Differential Privacy in Distributed Learning: Beyond Uniformly Bounded Stochastic Gradients
%A Yue Huang
%A Jiaojiao Zhang
%A Qing Ling
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-huang25b
%I PMLR
%P 3223--3231
%U https://proceedings.mlr.press/v258/huang25b.html
%V 258
%X This paper explores locally differentially private distributed algorithms that solve non-convex empirical risk minimization problems. Traditional approaches often assume uniformly bounded stochastic gradients, which may not hold in practice. To address this issue, we propose differentially \textbf{Pri}vate \textbf{S}tochastic recursive \textbf{M}omentum with gr\textbf{A}dient clipping (PriSMA) that judiciously integrates clipping and momentum to enhance utility while guaranteeing privacy. Without assuming uniformly bounded stochastic gradients, given privacy requirement $(\epsilon,\delta)$, PriSMA achieves a learning error of $\tilde{\mathcal{O}}\big((\frac{\sqrt{d}}{\sqrt{M}N\epsilon})^\frac{2}{5}\big)$, where $M$ is the number of clients, $N$ is the number of data samples on each client and $d$ is the model dimension. This learning error bound is better than the state-of-the-art $\tilde{\mathcal{O}}\big((\frac{\sqrt{d}}{{\sqrt{M}N\epsilon}})^\frac{1}{3}\big)$ in terms of the dependence on $M$ and $N$.
APA
Huang, Y., Zhang, J., & Ling, Q. (2025). Differential Privacy in Distributed Learning: Beyond Uniformly Bounded Stochastic Gradients. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:3223-3231. Available from https://proceedings.mlr.press/v258/huang25b.html.