Differential Privacy in Distributed Learning: Beyond Uniformly Bounded Stochastic Gradients
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:3223-3231, 2025.
Abstract
This paper explores locally differentially private distributed algorithms that solve non-convex empirical risk minimization problems. Traditional approaches often assume uniformly bounded stochastic gradients, which may not hold in practice. To address this issue, we propose differentially \textbf{Pri}vate \textbf{S}tochastic recursive \textbf{M}omentum with gr\textbf{A}dient clipping (PriSMA), which judiciously integrates clipping and momentum to enhance utility while guaranteeing privacy. Without assuming uniformly bounded stochastic gradients, given a privacy requirement $(\epsilon,\delta)$, PriSMA achieves a learning error of $\tilde{\mathcal{O}}\big(\big(\frac{\sqrt{d}}{\sqrt{M}N\epsilon}\big)^{\frac{2}{5}}\big)$, where $M$ is the number of clients, $N$ is the number of data samples on each client, and $d$ is the model dimension. This learning error bound improves on the state-of-the-art $\tilde{\mathcal{O}}\big(\big(\frac{\sqrt{d}}{\sqrt{M}N\epsilon}\big)^{\frac{1}{3}}\big)$ in its dependence on $M$ and $N$.
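The abstract does not spell out the update rule, but the name suggests a stochastic-recursive-momentum (STORM-style) estimator combined with clipping and Gaussian noise for local differential privacy. The following is a minimal sketch of one client step under that assumption; all names (prisma_client_step, beta, clip_c, sigma) are hypothetical, and the actual PriSMA update, clipping placement, and noise calibration are specified in the paper, not here.

```python
import numpy as np

def clip(v, c):
    """Scale v so that its l2 norm is at most c (standard gradient clipping)."""
    n = np.linalg.norm(v)
    return v if n <= c else v * (c / n)

def prisma_client_step(x, x_prev, d_prev, batch, grad, beta, clip_c, sigma, lr, rng):
    """One hypothetical client update: recursive momentum with clipping,
    followed by Gaussian noise before the direction leaves the client.

    grad(x, batch): stochastic gradient at x on the given minibatch.
    beta:   momentum weight of the recursive (STORM-style) estimator.
    clip_c: l2 clipping threshold, bounding the sensitivity of the message.
    sigma:  Gaussian noise std, calibrated offline to the (eps, delta) budget.
    """
    # Recursive momentum evaluates the *same* minibatch at both iterates.
    g_new = grad(x, batch)
    g_old = grad(x_prev, batch)
    d = g_new + (1.0 - beta) * (d_prev - g_old)
    # Clipping bounds the l2 sensitivity of the released direction, so that
    # adding Gaussian noise yields a locally differentially private message.
    d_clipped = clip(d, clip_c)
    d_private = d_clipped + rng.normal(0.0, sigma, size=np.shape(x))
    # The client keeps the un-noised momentum state; only d_private is sent.
    return x - lr * d_private, d_clipped
```

In a distributed run under this sketch, each of the $M$ clients would send its noisy direction to a server, which averages them to form the global step; averaging $M$ independent noise vectors is what drives the $\sqrt{M}$ factor in bounds of the form above.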