Distribution-Aware Mean Estimation under User-level Local Differential Privacy

Corentin Pla, Maxime Vono, Hugo Richard
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:2089-2097, 2025.

Abstract

We consider the problem of mean estimation under user-level local differential privacy, where $n$ users contribute through their local pools of data samples. Previous work assumes that the number of data samples is the same across users. In contrast, we consider a more general and realistic scenario where each user $u \in [n]$ owns $m_u$ data samples drawn from some generative distribution $\mu$, with $m_u$ unknown to the statistician but drawn from a known distribution $M$ over $\mathbb{N}$. Based on a distribution-aware mean estimation algorithm, we establish an $M$-dependent upper bound on the worst-case risk over $\mu$ for the task of mean estimation. We then derive a lower bound. The two bounds match asymptotically up to logarithmic factors and reduce to known bounds when $m_u = m$ for every user $u$.
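To make the setting concrete, here is a minimal simulation sketch. It is not the paper's distribution-aware algorithm, only a naive baseline under assumptions chosen for illustration: samples are Bernoulli (so they lie in $[0, 1]$), $M$ is taken to be uniform over $\{1, \dots, 10\}$, and each user reports its local mean perturbed by Laplace noise of scale $1/\varepsilon$. Since a local mean in $[0, 1]$ changes by at most 1 when a user's whole dataset is replaced, this report is $\varepsilon$-locally private at the user level. The function names and parameter values are hypothetical.

```python
import random


def laplace(rng, scale):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def privatize_user(samples, epsilon, rng):
    # The local mean lies in [0, 1], so replacing the user's entire dataset
    # moves it by at most 1; Laplace noise of scale 1/epsilon then gives
    # epsilon user-level local differential privacy.
    local_mean = sum(samples) / len(samples)
    return local_mean + laplace(rng, 1.0 / epsilon)


def estimate_mean(n, epsilon, true_mean, seed=0):
    # Assumption for illustration: M is uniform over {1, ..., 10} and
    # mu is Bernoulli(true_mean). Neither choice comes from the paper.
    rng = random.Random(seed)
    reports = []
    for _ in range(n):
        m_u = rng.randint(1, 10)  # number of samples, unknown to the server
        samples = [1.0 if rng.random() < true_mean else 0.0 for _ in range(m_u)]
        reports.append(privatize_user(samples, epsilon, rng))
    # The server only sees the noisy reports and averages them.
    return sum(reports) / n


if __name__ == "__main__":
    print(estimate_mean(n=20000, epsilon=1.0, true_mean=0.3))
```

This baseline treats every user identically regardless of $m_u$, so it does not exploit knowledge of $M$; the abstract's point is that an estimator aware of $M$ can achieve risk bounds that depend on that distribution.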

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-pla25a,
  title = {Distribution-Aware Mean Estimation under User-level Local Differential Privacy},
  author = {Pla, Corentin and Vono, Maxime and Richard, Hugo},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = {2089--2097},
  year = {2025},
  editor = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = {258},
  series = {Proceedings of Machine Learning Research},
  month = {03--05 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/pla25a/pla25a.pdf},
  url = {https://proceedings.mlr.press/v258/pla25a.html},
  abstract = {We consider the problem of mean estimation under user-level local differential privacy, where $n$ users contribute through their local pools of data samples. Previous work assumes that the number of data samples is the same across users. In contrast, we consider a more general and realistic scenario where each user $u \in [n]$ owns $m_u$ data samples drawn from some generative distribution $\mu$, with $m_u$ unknown to the statistician but drawn from a known distribution $M$ over $\mathbb{N}$. Based on a distribution-aware mean estimation algorithm, we establish an $M$-dependent upper bound on the worst-case risk over $\mu$ for the task of mean estimation. We then derive a lower bound. The two bounds match asymptotically up to logarithmic factors and reduce to known bounds when $m_u = m$ for every user $u$.}
}
Endnote
%0 Conference Paper
%T Distribution-Aware Mean Estimation under User-level Local Differential Privacy
%A Corentin Pla
%A Maxime Vono
%A Hugo Richard
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-pla25a
%I PMLR
%P 2089--2097
%U https://proceedings.mlr.press/v258/pla25a.html
%V 258
%X We consider the problem of mean estimation under user-level local differential privacy, where $n$ users contribute through their local pools of data samples. Previous work assumes that the number of data samples is the same across users. In contrast, we consider a more general and realistic scenario where each user $u \in [n]$ owns $m_u$ data samples drawn from some generative distribution $\mu$, with $m_u$ unknown to the statistician but drawn from a known distribution $M$ over $\mathbb{N}$. Based on a distribution-aware mean estimation algorithm, we establish an $M$-dependent upper bound on the worst-case risk over $\mu$ for the task of mean estimation. We then derive a lower bound. The two bounds match asymptotically up to logarithmic factors and reduce to known bounds when $m_u = m$ for every user $u$.
APA
Pla, C., Vono, M. & Richard, H. (2025). Distribution-Aware Mean Estimation under User-level Local Differential Privacy. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:2089-2097. Available from https://proceedings.mlr.press/v258/pla25a.html.
