Private k-Means Clustering with Stability Assumptions

Moshe Shechner; Or Sheffet; Uri Stemmer

Private k-Means Clustering with Stability Assumptions

Moshe Shechner, Or Sheffet, Uri Stemmer

Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:2518-2528, 2020.

Abstract

We study the problem of differentially private clustering under input-stability assumptions. Despite the ever-growing volume of works on differential privacy in general and differentially private clustering in particular, only three works (Nissim et al., 2007; Wang et al., 2015; Huang and Liu, 2018) looked at the problem of privately clustering "nice" k-means instances, all three relying on the sample-and-aggregate framework and all three measuring utility in terms of Wasserstein distance between the true cluster centers and the centers returned by the private algorithm. In this work we improve upon this line of works on multiple axes. We present a simpler algorithm for clustering stable inputs (not relying on the sample-and-aggregate framework), and analyze its utility in both the Wasserstein distance and the k-means cost. Moreover, our algorithm has straight-forward analogues for "nice" k-median instances and for the local-model of differential privacy.

Cite this Paper

BibTeX

@InProceedings{pmlr-v108-shechner20a,
  title = 	 {Private k-Means Clustering with Stability Assumptions},
  author =       {Shechner, Moshe and Sheffet, Or and Stemmer, Uri},
  booktitle = 	 {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages = 	 {2518--2528},
  year = 	 {2020},
  editor = 	 {Chiappa, Silvia and Calandra, Roberto},
  volume = 	 {108},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {26--28 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v108/shechner20a/shechner20a.pdf},
  url = 	 {https://proceedings.mlr.press/v108/shechner20a.html},
  abstract = 	 {We study the problem of differentially private clustering under input-stability assumptions. Despite the ever-growing volume of works on differential privacy in general and differentially private clustering in particular, only three works (Nissim et al., 2007; Wang et al., 2015; Huang and Liu, 2018) looked at the problem of privately clustering "nice" k-means instances, all three relying on the sample-and-aggregate framework and all three measuring utility in terms of Wasserstein distance between the true cluster centers and the centers returned by the private algorithm. In this work we improve upon this line of works on multiple axes. We present a simpler algorithm for clustering stable inputs (not relying on the sample-and-aggregate framework), and analyze its utility in both the Wasserstein distance and the k-means cost. Moreover, our algorithm has straight-forward analogues for "nice" k-median instances and for the local-model of differential privacy.}
}

Endnote

%0 Conference Paper
%T Private k-Means Clustering with Stability Assumptions
%A Moshe Shechner
%A Or Sheffet
%A Uri Stemmer
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra	
%F pmlr-v108-shechner20a
%I PMLR
%P 2518--2528
%U https://proceedings.mlr.press/v108/shechner20a.html
%V 108
%X We study the problem of differentially private clustering under input-stability assumptions. Despite the ever-growing volume of works on differential privacy in general and differentially private clustering in particular, only three works (Nissim et al., 2007; Wang et al., 2015; Huang and Liu, 2018) looked at the problem of privately clustering "nice" k-means instances, all three relying on the sample-and-aggregate framework and all three measuring utility in terms of Wasserstein distance between the true cluster centers and the centers returned by the private algorithm. In this work we improve upon this line of works on multiple axes. We present a simpler algorithm for clustering stable inputs (not relying on the sample-and-aggregate framework), and analyze its utility in both the Wasserstein distance and the k-means cost. Moreover, our algorithm has straight-forward analogues for "nice" k-median instances and for the local-model of differential privacy.

APA

Shechner, M., Sheffet, O. & Stemmer, U.. (2020). Private k-Means Clustering with Stability Assumptions. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:2518-2528 Available from https://proceedings.mlr.press/v108/shechner20a.html.

Related Material

Download PDF