Balls-and-Bins Sampling for DP-SGD

Lynn Chua; Badih Ghazi; Charlie Harrison; Pritish Kamath; Ravi Kumar; Ethan Jacob Leeman; Pasin Manurangsi; Amer Sinha; Chiyuan Zhang

Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:946-954, 2025.

Abstract

We introduce \emph{Balls-and-Bins} sampling for differentially private (DP) optimization methods such as DP-SGD. While it has been common practice to use some form of shuffling in DP-SGD implementations, privacy accounting algorithms have typically assumed that Poisson subsampling is used instead. Recent work by Chua et al. (2024), however, pointed out that shuffling-based DP-SGD can have a much larger privacy cost in practical parameter regimes. In this work we show that Balls-and-Bins sampling achieves the "best of both" samplers: its implementation is similar to that of Shuffling, and models trained using DP-SGD with Balls-and-Bins sampling achieve utility comparable to those trained using DP-SGD with Shuffling at the same noise multiplier; yet Balls-and-Bins sampling enjoys similar-or-better privacy amplification compared to Poisson subsampling in practical regimes.
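The abstract names three batch samplers without spelling out how each forms its batches. The following is a minimal Python sketch of one common reading, not the authors' implementation. It assumes that Shuffling permutes the dataset once and cuts it into equal batches, that Poisson subsampling includes each example in each batch independently with probability 1/T, and that Balls-and-Bins places each example in exactly one batch chosen uniformly at random. The function names and the parameter `num_batches` (T) are hypothetical, introduced only for illustration.

```python
import random


def shuffle_batches(n, num_batches, rng):
    """Shuffling (assumed form): permute once, cut into equal-size batches.

    Assumes n is divisible by num_batches; any remainder would be dropped.
    """
    order = list(range(n))
    rng.shuffle(order)
    size = n // num_batches
    return [order[i * size:(i + 1) * size] for i in range(num_batches)]


def poisson_batches(n, num_batches, rng):
    """Poisson subsampling (assumed form): every example joins every batch
    independently with probability 1/num_batches, so an example may appear
    in several batches or in none."""
    q = 1.0 / num_batches
    return [[i for i in range(n) if rng.random() < q]
            for _ in range(num_batches)]


def balls_and_bins_batches(n, num_batches, rng):
    """Balls-and-Bins (assumed form): each example (ball) lands in exactly
    one batch (bin), chosen uniformly at random and independently of the
    other examples. Batch sizes vary, but every example is used once."""
    batches = [[] for _ in range(num_batches)]
    for i in range(n):
        batches[rng.randrange(num_batches)].append(i)
    return batches


if __name__ == "__main__":
    rng = random.Random(0)
    for sampler in (shuffle_batches, poisson_batches, balls_and_bins_batches):
        batches = sampler(12, 3, rng)
        print(sampler.__name__, [len(b) for b in batches])
```

Under these assumptions, Balls-and-Bins resembles Shuffling in that each example is visited exactly once per pass, while the batch membership of each example is random, as it is in Poisson subsampling.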

Cite this Paper

BibTeX

@InProceedings{pmlr-v258-chua25a,
  title = 	 {Balls-and-Bins Sampling for DP-SGD},
  author =       {Chua, Lynn and Ghazi, Badih and Harrison, Charlie and Kamath, Pritish and Kumar, Ravi and Leeman, Ethan Jacob and Manurangsi, Pasin and Sinha, Amer and Zhang, Chiyuan},
  booktitle = 	 {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {946--954},
  year = 	 {2025},
  editor = 	 {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = 	 {258},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {03--05 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v258/main/assets/chua25a/chua25a.pdf},
  url = 	 {https://proceedings.mlr.press/v258/chua25a.html},
  abstract = 	 {We introduce the \emph{Balls-and-Bins} sampling for differentially private (DP) optimization methods such as DP-SGD. While it has been common practice to use some form of shuffling in DP-SGD implementations, privacy accounting algorithms have typically assumed that Poisson subsampling is used instead. Recent work by Chua et al. (2024), however, pointed out that shuffling based DP-SGD can have a much larger privacy cost in practical regimes of parameters. In this work we show that the Balls-and-Bins sampling achieves the "best-of-both" samplers, namely, the implementation of Balls-and-Bins sampling is similar to that of Shuffling and models trained using DP-SGD with Balls-and-Bins sampling achieve utility comparable to those trained using DP-SGD with Shuffling at the same noise multiplier, and yet, Balls-and-Bins sampling enjoys similar-or-better privacy amplification as compared to Poisson subsampling in practical regimes.}
}

Endnote

%0 Conference Paper
%T Balls-and-Bins Sampling for DP-SGD
%A Lynn Chua
%A Badih Ghazi
%A Charlie Harrison
%A Pritish Kamath
%A Ravi Kumar
%A Ethan Jacob Leeman
%A Pasin Manurangsi
%A Amer Sinha
%A Chiyuan Zhang
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-chua25a
%I PMLR
%P 946--954
%U https://proceedings.mlr.press/v258/chua25a.html
%V 258
%X We introduce the \emph{Balls-and-Bins} sampling for differentially private (DP) optimization methods such as DP-SGD. While it has been common practice to use some form of shuffling in DP-SGD implementations, privacy accounting algorithms have typically assumed that Poisson subsampling is used instead. Recent work by Chua et al. (2024), however, pointed out that shuffling based DP-SGD can have a much larger privacy cost in practical regimes of parameters. In this work we show that the Balls-and-Bins sampling achieves the "best-of-both" samplers, namely, the implementation of Balls-and-Bins sampling is similar to that of Shuffling and models trained using DP-SGD with Balls-and-Bins sampling achieve utility comparable to those trained using DP-SGD with Shuffling at the same noise multiplier, and yet, Balls-and-Bins sampling enjoys similar-or-better privacy amplification as compared to Poisson subsampling in practical regimes.

APA

Chua, L., Ghazi, B., Harrison, C., Kamath, P., Kumar, R., Leeman, E. J., Manurangsi, P., Sinha, A., & Zhang, C. (2025). Balls-and-Bins Sampling for DP-SGD. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:946-954. Available from https://proceedings.mlr.press/v258/chua25a.html.
