Power k-Means Clustering

Jason Xu, Kenneth Lange
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6921-6931, 2019.

Abstract

Clustering is a fundamental task in unsupervised machine learning. Lloyd’s 1957 algorithm for k-means clustering remains one of the most widely used due to its speed and simplicity, but the greedy approach is sensitive to initialization and often falls short at a poor solution. This paper explores an alternative to Lloyd’s algorithm that retains its simplicity and mitigates its tendency to get trapped by local minima. Called power k-means, our method embeds the k-means problem in a continuous class of similar, better behaved problems with fewer local minima. Power k-means anneals its way toward the solution of ordinary k-means by way of majorization-minimization (MM), sharing the appealing descent property and low complexity of Lloyd’s algorithm. Further, our method complements widely used seeding strategies, reaping marked improvements when used together as demonstrated on a suite of simulated and real data examples.
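The abstract describes minimizing a power-mean relaxation of the k-means objective via MM updates while annealing the power parameter s toward negative infinity, where the power mean recovers the minimum and the objective recovers ordinary k-means. Below is a minimal, hedged sketch of that idea; the weight formula follows the standard MM derivation for power means, but the schedule constants (`s0`, `eta`, the floor at `-20`) and the epsilon clip are illustrative assumptions, not values from the paper.

```python
import numpy as np

def power_kmeans(X, k, s0=-1.0, eta=1.05, n_iter=100, seed=0):
    """Sketch of power k-means: MM descent on the power-mean objective
    f_s = sum_i M_s(||x_i - c_1||^2, ..., ||x_i - c_k||^2), annealing s
    below zero so the objective approaches the ordinary k-means loss.
    Schedule constants here are illustrative, not from the paper."""
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    centers = X[rng.choice(n, size=k, replace=False)].astype(float)
    s = s0
    for _ in range(n_iter):
        # squared distances d_ij = ||x_i - c_j||^2 (eps guards powers of 0)
        D = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(-1) + 1e-12
        # MM weights: w_ij = ((1/k) sum_l d_il^s)^(1/s - 1) * d_ij^(s - 1)
        inner = (D ** s).mean(axis=1) ** (1.0 / s - 1.0)
        W = inner[:, None] * D ** (s - 1.0)
        # center update: weighted average of the data under the MM surrogate
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # anneal s further below zero; floor it for numerical stability
        s = max(s * eta, -20.0)
    return centers

# toy usage: three well-separated 2-D blobs
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(m, 0.1, size=(50, 2))
               for m in ([0, 0], [3, 0], [0, 3])])
centers = power_kmeans(X, k=3)
```

Because the weights are strictly positive, each updated center is a convex combination of the data, so centers stay inside the data's convex hull; as s grows more negative, the weights concentrate on each point's nearest center and the update approaches Lloyd's hard-assignment step.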

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-xu19a,
  title     = {Power k-Means Clustering},
  author    = {Xu, Jason and Lange, Kenneth},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {6921--6931},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/xu19a/xu19a.pdf},
  url       = {https://proceedings.mlr.press/v97/xu19a.html},
  abstract  = {Clustering is a fundamental task in unsupervised machine learning. Lloyd’s 1957 algorithm for k-means clustering remains one of the most widely used due to its speed and simplicity, but the greedy approach is sensitive to initialization and often falls short at a poor solution. This paper explores an alternative to Lloyd’s algorithm that retains its simplicity and mitigates its tendency to get trapped by local minima. Called power k-means, our method embeds the k-means problem in a continuous class of similar, better behaved problems with fewer local minima. Power k-means anneals its way toward the solution of ordinary k-means by way of majorization-minimization (MM), sharing the appealing descent property and low complexity of Lloyd’s algorithm. Further, our method complements widely used seeding strategies, reaping marked improvements when used together as demonstrated on a suite of simulated and real data examples.}
}
Endnote
%0 Conference Paper
%T Power k-Means Clustering
%A Jason Xu
%A Kenneth Lange
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-xu19a
%I PMLR
%P 6921--6931
%U https://proceedings.mlr.press/v97/xu19a.html
%V 97
%X Clustering is a fundamental task in unsupervised machine learning. Lloyd’s 1957 algorithm for k-means clustering remains one of the most widely used due to its speed and simplicity, but the greedy approach is sensitive to initialization and often falls short at a poor solution. This paper explores an alternative to Lloyd’s algorithm that retains its simplicity and mitigates its tendency to get trapped by local minima. Called power k-means, our method embeds the k-means problem in a continuous class of similar, better behaved problems with fewer local minima. Power k-means anneals its way toward the solution of ordinary k-means by way of majorization-minimization (MM), sharing the appealing descent property and low complexity of Lloyd’s algorithm. Further, our method complements widely used seeding strategies, reaping marked improvements when used together as demonstrated on a suite of simulated and real data examples.
APA
Xu, J. & Lange, K. (2019). Power k-Means Clustering. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:6921-6931. Available from https://proceedings.mlr.press/v97/xu19a.html.