Learning Augmented Graph $k$-Clustering

Chenglin Fan, Kijun Shin
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:1919-1934, 2025.

Abstract

Clustering is a fundamental task in unsupervised learning. Previous research has focused on learning-augmented $k$-means in Euclidean metrics, limiting its applicability to complex data representations. In this paper, we generalize learning-augmented $k$-clustering to operate on general metrics, enabling its application to graph-structured and non-Euclidean domains. Our framework also relaxes restrictive cluster size constraints, providing greater flexibility for datasets with imbalanced or unknown cluster distributions. Furthermore, we extend the hardness of query complexity to general metrics: under the Exponential Time Hypothesis (ETH), we show that any polynomial-time algorithm must perform approximately $\Omega(k / \alpha)$ queries to achieve a $(1 + \alpha)$-approximation. These contributions strengthen both the theoretical foundations and practical applicability of learning-augmented clustering, bridging gaps between traditional methods and real-world challenges.

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-fan25a, title = {Learning Augmented Graph $k$-Clustering}, author = {Fan, Chenglin and Shin, Kijun}, booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory}, pages = {1919--1934}, year = {2025}, editor = {Haghtalab, Nika and Moitra, Ankur}, volume = {291}, series = {Proceedings of Machine Learning Research}, month = {30 Jun--04 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/fan25a/fan25a.pdf}, url = {https://proceedings.mlr.press/v291/fan25a.html}, abstract = { Clustering is a fundamental task in unsupervised learning. Previous research has focused on learning-augmented $k$-means in Euclidean metrics, limiting its applicability to complex data representations. In this paper, we generalize learning-augmented $k$-clustering to operate on general metrics, enabling its application to graph-structured and non-Euclidean domains. Our framework also relaxes restrictive cluster size constraints, providing greater flexibility for datasets with imbalanced or unknown cluster distributions. Furthermore, we extend the hardness of query complexity to general metrics: under the Exponential Time Hypothesis (ETH), we show that any polynomial-time algorithm must perform approximately $\Omega(k / \alpha)$ queries to achieve a $(1 + \alpha)$-approximation. These contributions strengthen both the theoretical foundations and practical applicability of learning-augmented clustering, bridging gaps between traditional methods and real-world challenges.} }
Endnote
%0 Conference Paper %T Learning Augmented Graph $k$-Clustering %A Chenglin Fan %A Kijun Shin %B Proceedings of Thirty Eighth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2025 %E Nika Haghtalab %E Ankur Moitra %F pmlr-v291-fan25a %I PMLR %P 1919--1934 %U https://proceedings.mlr.press/v291/fan25a.html %V 291 %X Clustering is a fundamental task in unsupervised learning. Previous research has focused on learning-augmented $k$-means in Euclidean metrics, limiting its applicability to complex data representations. In this paper, we generalize learning-augmented $k$-clustering to operate on general metrics, enabling its application to graph-structured and non-Euclidean domains. Our framework also relaxes restrictive cluster size constraints, providing greater flexibility for datasets with imbalanced or unknown cluster distributions. Furthermore, we extend the hardness of query complexity to general metrics: under the Exponential Time Hypothesis (ETH), we show that any polynomial-time algorithm must perform approximately $\Omega(k / \alpha)$ queries to achieve a $(1 + \alpha)$-approximation. These contributions strengthen both the theoretical foundations and practical applicability of learning-augmented clustering, bridging gaps between traditional methods and real-world challenges.
APA
Fan, C. & Shin, K.. (2025). Learning Augmented Graph $k$-Clustering. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:1919-1934 Available from https://proceedings.mlr.press/v291/fan25a.html.

Related Material