Better Private Distribution Testing by Leveraging Unverified Auxiliary Data

Maryam Aliakbarpour, Arnav Burudgunte, Clément Canonne, Ronitt Rubinfeld
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:22-63, 2025.

Abstract

We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorithms in this augmented setting for three flagship distribution testing tasks, uniformity, identity, and closeness testing, whose sample complexity smoothly scales with the claimed quality of the auxiliary information. We complement our algorithms with information-theoretic lower bounds, showing that their sample complexity is optimal (up to logarithmic factors).

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-aliakbarpour25a,
  title = {Better Private Distribution Testing by Leveraging Unverified Auxiliary Data},
  author = {Aliakbarpour, Maryam and Burudgunte, Arnav and Canonne, Cl\'ement and Rubinfeld, Ronitt},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages = {22--63},
  year = {2025},
  editor = {Haghtalab, Nika and Moitra, Ankur},
  volume = {291},
  series = {Proceedings of Machine Learning Research},
  month = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/aliakbarpour25a/aliakbarpour25a.pdf},
  url = {https://proceedings.mlr.press/v291/aliakbarpour25a.html},
  abstract = {We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorithms in this augmented setting for three flagship distribution testing tasks, \emph{uniformity}, \emph{identity}, and \emph{closeness} testing, whose sample complexity smoothly scales with the claimed quality of the auxiliary information. We complement our algorithms with information-theoretic lower bounds, showing that their sample complexity is optimal (up to logarithmic factors).}
}
Endnote
%0 Conference Paper
%T Better Private Distribution Testing by Leveraging Unverified Auxiliary Data
%A Maryam Aliakbarpour
%A Arnav Burudgunte
%A Clément Canonne
%A Ronitt Rubinfeld
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-aliakbarpour25a
%I PMLR
%P 22--63
%U https://proceedings.mlr.press/v291/aliakbarpour25a.html
%V 291
%X We extend the framework of augmented distribution testing (Aliakbarpour, Indyk, Rubinfeld, and Silwal, NeurIPS 2024) to the differentially private setting. This captures scenarios where a data analyst must perform hypothesis testing tasks on sensitive data, but is able to leverage prior knowledge (public, but possibly erroneous or untrusted) about the data distribution. We design private algorithms in this augmented setting for three flagship distribution testing tasks, \emph{uniformity}, \emph{identity}, and \emph{closeness} testing, whose sample complexity smoothly scales with the claimed quality of the auxiliary information. We complement our algorithms with information-theoretic lower bounds, showing that their sample complexity is optimal (up to logarithmic factors).
APA
Aliakbarpour, M., Burudgunte, A., Canonne, C. & Rubinfeld, R. (2025). Better Private Distribution Testing by Leveraging Unverified Auxiliary Data. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:22-63. Available from https://proceedings.mlr.press/v291/aliakbarpour25a.html.
