Differentially Private Community Detection for Stochastic Block Models

Mohamed S Mohamed; Dung Nguyen; Anil Vullikanti; Ravi Tandon

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:15858-15894, 2022.

Abstract

The goal of community detection over graphs is to recover underlying labels/attributes of users (e.g., political affiliation) given the connectivity between users. There has been significant recent progress on understanding the fundamental limits of community detection when the graph is generated from a stochastic block model (SBM). Specifically, sharp information theoretic limits and efficient algorithms have been obtained for SBMs as a function of $p$ and $q$, which represent the intra-community and inter-community connection probabilities. In this paper, we study the community detection problem while preserving the privacy of the individual connections between the vertices. Focusing on the notion of $(\epsilon, \delta)$-edge differential privacy (DP), we seek to understand the fundamental tradeoffs between $(p, q)$, DP budget $(\epsilon, \delta)$, and computational efficiency for exact recovery of community labels. To this end, we present and analyze the associated information-theoretic tradeoffs for three differentially private community recovery mechanisms: a) stability based mechanism; b) sampling based mechanisms; and c) graph perturbation mechanisms. Our main findings are that stability and sampling based mechanisms lead to a superior tradeoff between $(p,q)$ and the privacy budget $(\epsilon, \delta)$; however this comes at the expense of higher computational complexity. On the other hand, albeit low complexity, graph perturbation mechanisms require the privacy budget $\epsilon$ to scale as $\Omega(\log(n))$ for exact recovery.
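
As a rough illustration of the setting the abstract describes, the sketch below samples a two-community SBM with connection probabilities $p$ and $q$, then perturbs it with randomized response, a standard way to obtain $\epsilon$-edge DP by flipping each vertex pair independently with probability $1/(1+e^{\epsilon})$. The function names, the balanced two-community layout, and the parameter values are assumptions made for illustration; this is not the paper's code and may differ in detail from the mechanisms it analyzes.

```python
import math
import random


def sample_sbm(n, p, q, rng):
    """Sample a two-community SBM (hypothetical helper).

    The first n // 2 vertices get label 0 and the rest get label 1.
    Same-label pairs are connected with probability p, others with q.
    """
    labels = [0 if i < n // 2 else 1 for i in range(n)]
    edges = set()
    for i in range(n):
        for j in range(i + 1, n):
            prob = p if labels[i] == labels[j] else q
            if rng.random() < prob:
                edges.add((i, j))
    return labels, edges


def randomized_response(n, edges, epsilon, rng):
    """Flip every vertex pair independently with probability 1 / (1 + e^epsilon)."""
    flip = 1.0 / (1.0 + math.exp(epsilon))
    noisy = set()
    for i in range(n):
        for j in range(i + 1, n):
            present = (i, j) in edges
            if rng.random() < flip:
                present = not present
            if present:
                noisy.add((i, j))
    return noisy


if __name__ == "__main__":
    rng = random.Random(0)
    n, p, q = 200, 0.5, 0.1  # illustrative values, not taken from the paper
    labels, edges = sample_sbm(n, p, q, rng)
    total_pairs = n * (n - 1) // 2
    for eps in (0.5, math.log(n)):
        noisy = randomized_response(n, edges, eps, rng)
        changed = len(noisy ^ edges)  # pairs whose edge status was flipped
        print(f"epsilon={eps:.2f}: {changed} of {total_pairs} pairs flipped")
```

At $\epsilon = \log(n)$ the flip probability is $1/(1+n)$, so each vertex sees roughly one flipped pair on average, whereas for constant $\epsilon$ a constant fraction of all pairs is flipped. This gives some intuition for why a perturbed graph is much harder to cluster at small privacy budgets.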

Cite this Paper

BibTeX


@InProceedings{pmlr-v162-mohamed22a,
  title = 	 {Differentially Private Community Detection for Stochastic Block Models},
  author =       {Mohamed, Mohamed S and Nguyen, Dung and Vullikanti, Anil and Tandon, Ravi},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {15858--15894},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/mohamed22a/mohamed22a.pdf},
  url = 	 {https://proceedings.mlr.press/v162/mohamed22a.html},
  abstract = 	 {The goal of community detection over graphs is to recover underlying labels/attributes of users (e.g., political affiliation) given the connectivity between users. There has been significant recent progress on understanding the fundamental limits of community detection when the graph is generated from a stochastic block model (SBM). Specifically, sharp information theoretic limits and efficient algorithms have been obtained for SBMs as a function of $p$ and $q$, which represent the intra-community and inter-community connection probabilities. In this paper, we study the community detection problem while preserving the privacy of the individual connections between the vertices. Focusing on the notion of $(\epsilon, \delta)$-edge differential privacy (DP), we seek to understand the fundamental tradeoffs between $(p, q)$, DP budget $(\epsilon, \delta)$, and computational efficiency for exact recovery of community labels. To this end, we present and analyze the associated information-theoretic tradeoffs for three differentially private community recovery mechanisms: a) stability based mechanism; b) sampling based mechanisms; and c) graph perturbation mechanisms. Our main findings are that stability and sampling based mechanisms lead to a superior tradeoff between $(p,q)$ and the privacy budget $(\epsilon, \delta)$; however this comes at the expense of higher computational complexity. On the other hand, albeit low complexity, graph perturbation mechanisms require the privacy budget $\epsilon$ to scale as $\Omega(\log(n))$ for exact recovery.}
}

Endnote

%0 Conference Paper
%T Differentially Private Community Detection for Stochastic Block Models
%A Mohamed S Mohamed
%A Dung Nguyen
%A Anil Vullikanti
%A Ravi Tandon
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-mohamed22a
%I PMLR
%P 15858--15894
%U https://proceedings.mlr.press/v162/mohamed22a.html
%V 162
%X The goal of community detection over graphs is to recover underlying labels/attributes of users (e.g., political affiliation) given the connectivity between users. There has been significant recent progress on understanding the fundamental limits of community detection when the graph is generated from a stochastic block model (SBM). Specifically, sharp information theoretic limits and efficient algorithms have been obtained for SBMs as a function of $p$ and $q$, which represent the intra-community and inter-community connection probabilities. In this paper, we study the community detection problem while preserving the privacy of the individual connections between the vertices. Focusing on the notion of $(\epsilon, \delta)$-edge differential privacy (DP), we seek to understand the fundamental tradeoffs between $(p, q)$, DP budget $(\epsilon, \delta)$, and computational efficiency for exact recovery of community labels. To this end, we present and analyze the associated information-theoretic tradeoffs for three differentially private community recovery mechanisms: a) stability based mechanism; b) sampling based mechanisms; and c) graph perturbation mechanisms. Our main findings are that stability and sampling based mechanisms lead to a superior tradeoff between $(p,q)$ and the privacy budget $(\epsilon, \delta)$; however this comes at the expense of higher computational complexity. On the other hand, albeit low complexity, graph perturbation mechanisms require the privacy budget $\epsilon$ to scale as $\Omega(\log(n))$ for exact recovery.

APA


Mohamed, M.S., Nguyen, D., Vullikanti, A. & Tandon, R. (2022). Differentially Private Community Detection for Stochastic Block Models. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:15858-15894. Available from https://proceedings.mlr.press/v162/mohamed22a.html.
