Convergence of online k-means

Geelon So, Gaurav Mahajan, Sanjoy Dasgupta
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:8534-8569, 2022.

Abstract

We prove asymptotic convergence for a general class of k-means algorithms performed over streaming data from a distribution: the centers asymptotically converge to the set of stationary points of the k-means objective function. To do so, we show that online k-means over a distribution can be interpreted as stochastic gradient descent with a stochastic learning rate schedule. Then, we prove convergence by extending techniques used in optimization literature to handle settings where center-specific learning rates may depend on the past trajectory of the centers.
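For intuition, here is a minimal Python sketch of one member of this class, the classic MacQueen-style online k-means update; the function name and Gaussian initialization are illustrative assumptions, not the paper's setup. Each center i moves toward an arriving point with step size 1/n_i, where n_i counts how many points that center has claimed so far. The step size is therefore center-specific and depends on the past trajectory of the centers, which is the stochastic learning rate schedule the abstract refers to.

```python
import numpy as np

def online_kmeans(stream, k, dim, rng=None):
    """Minimal MacQueen-style online k-means sketch (illustrative, not
    the paper's full generality).

    Each arriving point x moves its nearest center with step size
    1/(count + 1), so a center's effective learning rate depends on how
    often it has won so far -- a stochastic, trajectory-dependent
    schedule, viewed as SGD on the k-means objective.
    """
    rng = rng or np.random.default_rng(0)
    centers = rng.standard_normal((k, dim))  # placeholder initialization
    counts = np.zeros(k)
    for x in stream:
        # Assign x to its nearest center.
        i = np.argmin(np.linalg.norm(centers - x, axis=1))
        counts[i] += 1
        # SGD step: (x - centers[i]) is the negative gradient of the
        # k-means objective restricted to center i's cell; the learning
        # rate 1/counts[i] is stochastic and trajectory-dependent.
        centers[i] += (x - centers[i]) / counts[i]
    return centers

# Example usage on a synthetic stream of 10,000 points in the plane:
# data = (np.random.default_rng(1).standard_normal(2) for _ in range(10_000))
# centers = online_kmeans(data, k=3, dim=2)
```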

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-so22a,
  title     = {Convergence of online k-means},
  author    = {So, Geelon and Mahajan, Gaurav and Dasgupta, Sanjoy},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {8534--8569},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/so22a/so22a.pdf},
  url       = {https://proceedings.mlr.press/v151/so22a.html},
  abstract  = {We prove asymptotic convergence for a general class of k-means algorithms performed over streaming data from a distribution: the centers asymptotically converge to the set of stationary points of the k-means objective function. To do so, we show that online k-means over a distribution can be interpreted as stochastic gradient descent with a stochastic learning rate schedule. Then, we prove convergence by extending techniques used in optimization literature to handle settings where center-specific learning rates may depend on the past trajectory of the centers.}
}
Endnote
%0 Conference Paper
%T Convergence of online k-means
%A Geelon So
%A Gaurav Mahajan
%A Sanjoy Dasgupta
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-so22a
%I PMLR
%P 8534--8569
%U https://proceedings.mlr.press/v151/so22a.html
%V 151
%X We prove asymptotic convergence for a general class of k-means algorithms performed over streaming data from a distribution: the centers asymptotically converge to the set of stationary points of the k-means objective function. To do so, we show that online k-means over a distribution can be interpreted as stochastic gradient descent with a stochastic learning rate schedule. Then, we prove convergence by extending techniques used in optimization literature to handle settings where center-specific learning rates may depend on the past trajectory of the centers.
APA
So, G., Mahajan, G. & Dasgupta, S. (2022). Convergence of online k-means. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:8534-8569. Available from https://proceedings.mlr.press/v151/so22a.html.
