Better Locally Private Sparse Estimation Given Multiple Samples Per User

Yuheng Ma, Ke Jia, Hanfang Yang
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:33746-33776, 2024.

Abstract

Previous studies yielded discouraging results for item-level locally differentially private linear regression with $s$-sparsity assumption, where the minimax rate for $nm$ samples is $\mathcal{O}(sd / nm\varepsilon^2)$. This can be challenging for high-dimensional data, where the dimension $d$ is extremely large. In this work, we investigate user-level locally differentially private sparse linear regression. We show that with $n$ users each contributing $m$ samples, the linear dependency of dimension $d$ can be eliminated, yielding an error upper bound of $\mathcal{O}(s/ nm\varepsilon^2)$. We propose a framework that first selects candidate variables and then conducts estimation in the narrowed low-dimensional space, which is extendable to general sparse estimation problems with tight error bounds. Experiments on both synthetic and real datasets demonstrate the superiority of the proposed methods. Both the theoretical and empirical results suggest that, with the same number of samples, locally private sparse estimation is better conducted when multiple samples per user are available.

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-ma24c,
  title = {Better Locally Private Sparse Estimation Given Multiple Samples Per User},
  author = {Ma, Yuheng and Jia, Ke and Yang, Hanfang},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages = {33746--33776},
  year = {2024},
  editor = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume = {235},
  series = {Proceedings of Machine Learning Research},
  month = {21--27 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/ma24c/ma24c.pdf},
  url = {https://proceedings.mlr.press/v235/ma24c.html},
  abstract = {Previous studies yielded discouraging results for item-level locally differentially private linear regression with $s$-sparsity assumption, where the minimax rate for $nm$ samples is $\mathcal{O}(sd / nm\varepsilon^2)$. This can be challenging for high-dimensional data, where the dimension $d$ is extremely large. In this work, we investigate user-level locally differentially private sparse linear regression. We show that with $n$ users each contributing $m$ samples, the linear dependency of dimension $d$ can be eliminated, yielding an error upper bound of $\mathcal{O}(s/ nm\varepsilon^2)$. We propose a framework that first selects candidate variables and then conducts estimation in the narrowed low-dimensional space, which is extendable to general sparse estimation problems with tight error bounds. Experiments on both synthetic and real datasets demonstrate the superiority of the proposed methods. Both the theoretical and empirical results suggest that, with the same number of samples, locally private sparse estimation is better conducted when multiple samples per user are available.}
}
Endnote
%0 Conference Paper
%T Better Locally Private Sparse Estimation Given Multiple Samples Per User
%A Yuheng Ma
%A Ke Jia
%A Hanfang Yang
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-ma24c
%I PMLR
%P 33746--33776
%U https://proceedings.mlr.press/v235/ma24c.html
%V 235
%X Previous studies yielded discouraging results for item-level locally differentially private linear regression with $s$-sparsity assumption, where the minimax rate for $nm$ samples is $\mathcal{O}(sd / nm\varepsilon^2)$. This can be challenging for high-dimensional data, where the dimension $d$ is extremely large. In this work, we investigate user-level locally differentially private sparse linear regression. We show that with $n$ users each contributing $m$ samples, the linear dependency of dimension $d$ can be eliminated, yielding an error upper bound of $\mathcal{O}(s/ nm\varepsilon^2)$. We propose a framework that first selects candidate variables and then conducts estimation in the narrowed low-dimensional space, which is extendable to general sparse estimation problems with tight error bounds. Experiments on both synthetic and real datasets demonstrate the superiority of the proposed methods. Both the theoretical and empirical results suggest that, with the same number of samples, locally private sparse estimation is better conducted when multiple samples per user are available.
APA
Ma, Y., Jia, K. & Yang, H. (2024). Better Locally Private Sparse Estimation Given Multiple Samples Per User. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:33746-33776. Available from https://proceedings.mlr.press/v235/ma24c.html.
