Power kMeans Clustering
[edit]
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:69216931, 2019.
Abstract
Clustering is a fundamental task in unsupervised machine learning. Lloyd’s 1957 algorithm for kmeans clustering remains one of the most widely used due to its speed and simplicity, but the greedy approach is sensitive to initialization and often falls short at a poor solution. This paper explores an alternative to Lloyd’s algorithm that retains its simplicity and mitigates its tendency to get trapped by local minima. Called power kmeans, our method embeds the kmeans problem in a continuous class of similar, better behaved problems with fewer local minima. Power kmeans anneals its way toward the solution of ordinary kmeans by way of majorizationminimization (MM), sharing the appealing descent property and low complexity of Lloyd’s algorithm. Further, our method complements widely used seeding strategies, reaping marked improvements when used together as demonstrated on a suite of simulated and real data examples.
Related Material


