Differentially private multi-party data release for linear regression

Ruihan Wu, Xin Yang, Yuanshun Yao, Jiankai Sun, Tianyi Liu, Q. Kilian Weinberger, Chong Wang
Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, PMLR 180:2128-2137, 2022.

Abstract

Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects. However, the majority of prior work has focused on scenarios where a single party owns all the data. In this paper, we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects. Within the context of linear regression, we aim to allow all parties to train models on the complete data without the ability to infer private attributes or identities of individuals. We start by directly applying the Gaussian mechanism and show that it suffers from the small eigenvalue problem. We then propose a novel method and prove that it asymptotically converges to the optimal (non-private) solution as the dataset size increases. We substantiate the theoretical results through experiments on both artificial and real-world datasets.
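
As a rough illustration of why directly applying the Gaussian mechanism to linear regression can run into a small eigenvalue problem, the sketch below perturbs a two-feature Gram matrix X^T X with Gaussian noise and checks how often its smallest eigenvalue falls below a threshold. This is not the paper's algorithm, and it ignores the multi-party split of attributes. It assumes features bounded in [-1, 1], add-or-remove-one-row neighboring datasets, and the classical Gaussian-mechanism noise scale; all function names, the privacy parameters, and the threshold are hypothetical choices made for illustration.

```python
import math
import random


def gram(rows):
    """Return X^T X for two-feature rows as (a, b, c), meaning [[a, b], [b, c]]."""
    a = sum(x1 * x1 for x1, _ in rows)
    b = sum(x1 * x2 for x1, x2 in rows)
    c = sum(x2 * x2 for _, x2 in rows)
    return a, b, c


def gaussian_sigma(sensitivity, epsilon, delta):
    """Noise scale of the classical Gaussian mechanism (stated for 0 < epsilon < 1)."""
    return sensitivity * math.sqrt(2.0 * math.log(1.25 / delta)) / epsilon


def min_eigenvalue(a, b, c):
    """Smallest eigenvalue of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    return 0.5 * (a + c) - math.sqrt(0.25 * (a - c) ** 2 + b * b)


def noisy_min_eigenvalue(rows, epsilon, delta, rng):
    """Add symmetric Gaussian noise to the Gram matrix; return its smallest eigenvalue."""
    # Assumption: every feature lies in [-1, 1], so adding or removing one row x
    # changes X^T X by x x^T, whose Frobenius norm is ||x||^2 <= 2.
    sigma = gaussian_sigma(2.0, epsilon, delta)
    a, b, c = (v + rng.gauss(0.0, sigma) for v in gram(rows))
    return min_eigenvalue(a, b, c)


def fraction_ill_conditioned(n, trials, rng, epsilon=0.5, delta=1e-5, threshold=1.0):
    """Share of trials in which the noisy Gram matrix has an eigenvalue below threshold."""
    bad = 0
    for _ in range(trials):
        rows = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n)]
        if noisy_min_eigenvalue(rows, epsilon, delta, rng) < threshold:
            bad += 1
    return bad / trials


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical dataset sizes: the noise scale is fixed, the Gram matrix grows with n.
    for n in (20, 200, 20000):
        print(n, fraction_ill_conditioned(n, trials=20, rng=rng))
```

Because the noise scale does not depend on n while the entries of X^T X grow roughly linearly in n, small datasets are the ones where the perturbed matrix is most likely to be near-singular or indefinite, which makes solving the normal equations unstable.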

Cite this Paper


BibTeX
@InProceedings{pmlr-v180-wu22b,
  title = {Differentially private multi-party data release for linear regression},
  author = {Wu, Ruihan and Yang, Xin and Yao, Yuanshun and Sun, Jiankai and Liu, Tianyi and Weinberger, Q. Kilian and Wang, Chong},
  booktitle = {Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence},
  pages = {2128--2137},
  year = {2022},
  editor = {Cussens, James and Zhang, Kun},
  volume = {180},
  series = {Proceedings of Machine Learning Research},
  month = {01--05 Aug},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v180/wu22b/wu22b.pdf},
  url = {https://proceedings.mlr.press/v180/wu22b.html},
  abstract = {Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects. However the majority of prior work has focused on scenarios where a single party owns all the data. In this paper we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects. Within the context of linear regression that allow all parties to train models on the complete data without the ability to infer private attributes or identities of individuals, we start with directly applying Gaussian mechanism and show it has the small eigenvalue problem. We further propose our novel method and prove it asymptotically converges to the optimal (non-private) solutions with increasing dataset size. We substantiate the theoretical results through experiments on both artificial and real-world datasets.}
}
Endnote
%0 Conference Paper
%T Differentially private multi-party data release for linear regression
%A Ruihan Wu
%A Xin Yang
%A Yuanshun Yao
%A Jiankai Sun
%A Tianyi Liu
%A Q. Kilian Weinberger
%A Chong Wang
%B Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2022
%E James Cussens
%E Kun Zhang
%F pmlr-v180-wu22b
%I PMLR
%P 2128--2137
%U https://proceedings.mlr.press/v180/wu22b.html
%V 180
%X Differentially Private (DP) data release is a promising technique to disseminate data without compromising the privacy of data subjects. However the majority of prior work has focused on scenarios where a single party owns all the data. In this paper we focus on the multi-party setting, where different stakeholders own disjoint sets of attributes belonging to the same group of data subjects. Within the context of linear regression that allow all parties to train models on the complete data without the ability to infer private attributes or identities of individuals, we start with directly applying Gaussian mechanism and show it has the small eigenvalue problem. We further propose our novel method and prove it asymptotically converges to the optimal (non-private) solutions with increasing dataset size. We substantiate the theoretical results through experiments on both artificial and real-world datasets.
APA
Wu, R., Yang, X., Yao, Y., Sun, J., Liu, T., Weinberger, Q.K. & Wang, C. (2022). Differentially private multi-party data release for linear regression. Proceedings of the Thirty-Eighth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 180:2128-2137. Available from https://proceedings.mlr.press/v180/wu22b.html.
