Power k-Means Clustering
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:6921-6931, 2019.
Clustering is a fundamental task in unsupervised machine learning. Lloyd’s 1957 algorithm for k-means clustering remains one of the most widely used due to its speed and simplicity, but the greedy approach is sensitive to initialization and often falls short at a poor solution. This paper explores an alternative to Lloyd’s algorithm that retains its simplicity and mitigates its tendency to get trapped by local minima. Called power k-means, our method embeds the k-means problem in a continuous class of similar, better behaved problems with fewer local minima. Power k-means anneals its way toward the solution of ordinary k-means by way of majorization-minimization (MM), sharing the appealing descent property and low complexity of Lloyd’s algorithm. Further, our method complements widely used seeding strategies, reaping marked improvements when used together as demonstrated on a suite of simulated and real data examples.