Large Scale K-Median Clustering for Stable Clustering Instances

Konstantin Voevodski
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2890-2898, 2021.

Abstract

We study the problem of computing a good k-median clustering in a parallel computing environment. We design an efficient algorithm that gives a constant-factor approximation to the optimal solution for stable clustering instances. The notion of stability that we consider is resilience to perturbations of the distances between the points. Our computational experiments show that our algorithm works well in practice - we are able to find better clusterings than Lloyd’s algorithm and a centralized coreset construction using samples of the same size.

Cite this Paper


BibTeX
@InProceedings{pmlr-v130-voevodski21a,
  title     = {Large Scale K-Median Clustering for Stable Clustering Instances},
  author    = {Voevodski, Konstantin},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {2890--2898},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/voevodski21a/voevodski21a.pdf},
  url       = {https://proceedings.mlr.press/v130/voevodski21a.html},
  abstract  = {We study the problem of computing a good k-median clustering in a parallel computing environment. We design an efficient algorithm that gives a constant-factor approximation to the optimal solution for stable clustering instances. The notion of stability that we consider is resilience to perturbations of the distances between the points. Our computational experiments show that our algorithm works well in practice - we are able to find better clusterings than Lloyd’s algorithm and a centralized coreset construction using samples of the same size.}
}
Endnote
%0 Conference Paper
%T Large Scale K-Median Clustering for Stable Clustering Instances
%A Konstantin Voevodski
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-voevodski21a
%I PMLR
%P 2890--2898
%U https://proceedings.mlr.press/v130/voevodski21a.html
%V 130
%X We study the problem of computing a good k-median clustering in a parallel computing environment. We design an efficient algorithm that gives a constant-factor approximation to the optimal solution for stable clustering instances. The notion of stability that we consider is resilience to perturbations of the distances between the points. Our computational experiments show that our algorithm works well in practice - we are able to find better clusterings than Lloyd’s algorithm and a centralized coreset construction using samples of the same size.
APA
Voevodski, K. (2021). Large Scale K-Median Clustering for Stable Clustering Instances. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2890-2898. Available from https://proceedings.mlr.press/v130/voevodski21a.html.
