Differentially-Private Clustering of Easy Instances

Edith Cohen, Haim Kaplan, Yishay Mansour, Uri Stemmer, Eliad Tsfadia
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2049-2059, 2021.

Abstract

Clustering is a fundamental problem in data analysis. In differentially private clustering, the goal is to identify k cluster centers without disclosing information on individual data points. Despite significant research progress, the problem had so far resisted practical solutions. In this work we aim at providing simple, implementable differentially private clustering algorithms when the data is "easy," e.g., when there exists a significant separation between the clusters. For the easy instances we consider, we have a simple implementation based on utilizing non-private clustering algorithms, and combining them privately. We are able to get improved sample complexity bounds in some cases of Gaussian mixtures and k-means. We complement our theoretical algorithms with experiments on simulated data.
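To make the high-level idea in the abstract concrete, here is a minimal illustrative sketch in Python of a generic "run a non-private clusterer on disjoint chunks, then combine the resulting centers with calibrated noise" pattern. This is not the paper's algorithm: the function name, the chunking scheme, the naive matching of centers by first coordinate, and the bounded-norm assumption are all assumptions made for illustration, and the sketch is only sensible when clusters are strongly separated.

# Illustrative sketch only (NOT the paper's algorithm): non-private k-means
# on disjoint chunks, followed by a noisy aggregation of the matched centers.
import numpy as np
from sklearn.cluster import KMeans

def noisy_aggregated_centers(X, k, epsilon, delta, radius, n_chunks=10, seed=0):
    """Assumes every point in X has L2 norm at most `radius` and epsilon <= 1.

    Only the final aggregation step is made noisy here; the per-chunk
    k-means runs are ordinary non-private sklearn calls.
    """
    rng = np.random.default_rng(seed)
    chunks = np.array_split(rng.permutation(X), n_chunks)

    # Non-private k-means on each disjoint chunk.
    all_centers = []
    for chunk in chunks:
        km = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(chunk)
        all_centers.append(km.cluster_centers_)

    # Toy matching: sort each chunk's centers by first coordinate. A real
    # algorithm needs a proper (private) matching/aggregation step.
    all_centers = [c[np.argsort(c[:, 0])] for c in all_centers]
    mean_centers = np.mean(all_centers, axis=0)  # shape (k, d)

    # Changing one input point alters at most one chunk's k x d center
    # matrix, whose rows stay in the radius-`radius` ball, so the averaged
    # matrix moves by at most 2*radius*sqrt(k)/n_chunks in Frobenius norm.
    l2_sensitivity = 2.0 * radius * np.sqrt(k) / n_chunks
    # Standard Gaussian mechanism calibration for (epsilon, delta)-DP.
    sigma = l2_sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / epsilon
    return mean_centers + rng.normal(scale=sigma, size=mean_centers.shape)

Under strong separation, the per-chunk centers roughly agree, so the noisy average remains close to the true centers; without separation the naive matching step breaks down, which is exactly the kind of issue the paper's private combination step is designed to handle.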

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-cohen21c,
  title     = {Differentially-Private Clustering of Easy Instances},
  author    = {Cohen, Edith and Kaplan, Haim and Mansour, Yishay and Stemmer, Uri and Tsfadia, Eliad},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {2049--2059},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/cohen21c/cohen21c.pdf},
  url       = {https://proceedings.mlr.press/v139/cohen21c.html},
  abstract  = {Clustering is a fundamental problem in data analysis. In differentially private clustering, the goal is to identify k cluster centers without disclosing information on individual data points. Despite significant research progress, the problem had so far resisted practical solutions. In this work we aim at providing simple, implementable differentially private clustering algorithms when the data is "easy," e.g., when there exists a significant separation between the clusters. For the easy instances we consider, we have a simple implementation based on utilizing non-private clustering algorithms, and combining them privately. We are able to get improved sample complexity bounds in some cases of Gaussian mixtures and k-means. We complement our theoretical algorithms with experiments on simulated data.}
}
Endnote
%0 Conference Paper
%T Differentially-Private Clustering of Easy Instances
%A Edith Cohen
%A Haim Kaplan
%A Yishay Mansour
%A Uri Stemmer
%A Eliad Tsfadia
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-cohen21c
%I PMLR
%P 2049--2059
%U https://proceedings.mlr.press/v139/cohen21c.html
%V 139
%X Clustering is a fundamental problem in data analysis. In differentially private clustering, the goal is to identify k cluster centers without disclosing information on individual data points. Despite significant research progress, the problem had so far resisted practical solutions. In this work we aim at providing simple, implementable differentially private clustering algorithms when the data is "easy," e.g., when there exists a significant separation between the clusters. For the easy instances we consider, we have a simple implementation based on utilizing non-private clustering algorithms, and combining them privately. We are able to get improved sample complexity bounds in some cases of Gaussian mixtures and k-means. We complement our theoretical algorithms with experiments on simulated data.
APA
Cohen, E., Kaplan, H., Mansour, Y., Stemmer, U. & Tsfadia, E. (2021). Differentially-Private Clustering of Easy Instances. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2049-2059. Available from https://proceedings.mlr.press/v139/cohen21c.html.
