Split, count, and share: a differentially private set intersection cardinality estimation protocol

Michael Purcell, Yang Li, Kee Siong Ng
Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, PMLR 216:1684-1694, 2023.

Abstract

We describe a simple two-party protocol in which each party contributes a set as input. The output of the protocol is an estimate of the cardinality of the intersection of the two input sets. We show that our protocol is efficient and secure. We show that the space complexity and communication complexity are constant, the time complexity for each party is proportional to the size of their input set, and that our protocol is differentially private. We also analyze the distribution of the output of the protocol, deriving both its asymptotic distribution and finite-sample bounds on its tail probabilities. These analyses show that, when the input sets are large, our protocol produces accurate set intersection cardinality estimates. We claim that our protocol is an attractive alternative to traditional private set intersection cardinality (PSI-CA) protocols when the input sets are large, exact precision is not required, and differential privacy on its own can provide sufficient protection to the underlying sensitive data.

Cite this Paper


BibTeX
@InProceedings{pmlr-v216-purcell23a, title = {Split, count, and share: a differentially private set intersection cardinality estimation protocol}, author = {Purcell, Michael and Li, Yang and Ng, Kee Siong}, booktitle = {Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence}, pages = {1684--1694}, year = {2023}, editor = {Evans, Robin J. and Shpitser, Ilya}, volume = {216}, series = {Proceedings of Machine Learning Research}, month = {31 Jul--04 Aug}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v216/purcell23a/purcell23a.pdf}, url = {https://proceedings.mlr.press/v216/purcell23a.html}, abstract = {We describe a simple two-party protocol in which each party contributes a set as input. The output of the protocol is an estimate of the cardinality of the intersection of the two input sets. We show that our protocol is efficient and secure. We show that the space complexity and communication complexity are constant, the time complexity for each party is proportional to the size of their input set, and that our protocol is differentially private. We also analyze the distribution of the output of the protocol, deriving both its asymptotic distribution and finite-sample bounds on its tail probabilities. These analyses show that, when the input sets are large, our protocol produces accurate set intersection cardinality estimates. We claim that our protocol is an attractive alternative to traditional private set intersection cardinality (PSI-CA) protocols when the input sets are large, exact precision is not required, and differential privacy on its own can provide sufficient protection to the underlying sensitive data.} }
Endnote
%0 Conference Paper %T Split, count, and share: a differentially private set intersection cardinality estimation protocol %A Michael Purcell %A Yang Li %A Kee Siong Ng %B Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence %C Proceedings of Machine Learning Research %D 2023 %E Robin J. Evans %E Ilya Shpitser %F pmlr-v216-purcell23a %I PMLR %P 1684--1694 %U https://proceedings.mlr.press/v216/purcell23a.html %V 216 %X We describe a simple two-party protocol in which each party contributes a set as input. The output of the protocol is an estimate of the cardinality of the intersection of the two input sets. We show that our protocol is efficient and secure. We show that the space complexity and communication complexity are constant, the time complexity for each party is proportional to the size of their input set, and that our protocol is differentially private. We also analyze the distribution of the output of the protocol, deriving both its asymptotic distribution and finite-sample bounds on its tail probabilities. These analyses show that, when the input sets are large, our protocol produces accurate set intersection cardinality estimates. We claim that our protocol is an attractive alternative to traditional private set intersection cardinality (PSI-CA) protocols when the input sets are large, exact precision is not required, and differential privacy on its own can provide sufficient protection to the underlying sensitive data.
APA
Purcell, M., Li, Y. & Ng, K.S.. (2023). Split, count, and share: a differentially private set intersection cardinality estimation protocol. Proceedings of the Thirty-Ninth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 216:1684-1694 Available from https://proceedings.mlr.press/v216/purcell23a.html.

Related Material