Faster Approximation Algorithms for k-Center via Data Reduction

Arnold Filtser, Shaofeng H.-C. Jiang, Yi Li, Anurag Murty Naredla, Ioannis Psarros, Qiaoyuan Yang, Qin Zhang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:17189-17202, 2025.

Abstract

We study efficient algorithms for the Euclidean $k$-Center problem, focusing on the regime of large $k$. We take the approach of data reduction by considering an $\alpha$-coreset, which is a small subset $S$ of the dataset $P$ such that any $\beta$-approximation on $S$ is an $(\alpha + \beta)$-approximation on $P$. We give efficient algorithms to construct coresets whose size is $k \cdot o(n)$, which immediately speeds up existing approximation algorithms. Notably, we obtain a near-linear time $O(1)$-approximation when $k = n^c$ for any $0 < c < 1$. We validate the performance of our coresets on real-world datasets with large $k$, and we observe that the coreset speeds up the well-known Gonzalez algorithm by up to $4$ times, while still achieving similar clustering cost. Technically, one of our coreset results is based on a new efficient construction of consistent hashing with competitive parameters. This general tool may be of independent interest for algorithm design in high dimensional Euclidean spaces.
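For context, the Gonzalez algorithm mentioned above is the classic greedy farthest-first traversal, which yields a $2$-approximation for $k$-Center; a coreset speeds it up simply by running it on the small subset $S$ instead of all of $P$. Below is a minimal illustrative sketch of that baseline (not the paper's coreset construction); the function name and the plain-list representation of points are our own choices.

```python
import math
import random

def gonzalez(points, k, seed=0):
    """Greedy farthest-first traversal: the classic 2-approximation
    for Euclidean k-Center. Returns the chosen centers and the
    clustering cost (max distance of any point to its nearest center)."""
    rng = random.Random(seed)
    # Start from an arbitrary point.
    centers = [points[rng.randrange(len(points))]]
    # dist[i] = distance of points[i] to its nearest chosen center so far.
    dist = [math.dist(p, centers[0]) for p in points]
    while len(centers) < k:
        # Pick the point farthest from all current centers.
        i = max(range(len(points)), key=dist.__getitem__)
        centers.append(points[i])
        dist = [min(d, math.dist(p, points[i])) for d, p in zip(dist, points)]
    return centers, max(dist)
```

With an $\alpha$-coreset $S$ of $P$, one would call `gonzalez(S, k)`; since Gonzalez is a $\beta$-approximation with $\beta = 2$, the composition property in the abstract guarantees an $(\alpha + 2)$-approximation on the full dataset $P$.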

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-filtser25a,
  title     = {Faster Approximation Algorithms for k-Center via Data Reduction},
  author    = {Filtser, Arnold and Jiang, Shaofeng H.-C. and Li, Yi and Naredla, Anurag Murty and Psarros, Ioannis and Yang, Qiaoyuan and Zhang, Qin},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {17189--17202},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/filtser25a/filtser25a.pdf},
  url       = {https://proceedings.mlr.press/v267/filtser25a.html},
  abstract  = {We study efficient algorithms for the Euclidean $k$-Center problem, focusing on the regime of large $k$. We take the approach of data reduction by considering an $\alpha$-coreset, which is a small subset $S$ of the dataset $P$ such that any $\beta$-approximation on $S$ is an $(\alpha + \beta)$-approximation on $P$. We give efficient algorithms to construct coresets whose size is $k \cdot o(n)$, which immediately speeds up existing approximation algorithms. Notably, we obtain a near-linear time $O(1)$-approximation when $k = n^c$ for any $0 < c < 1$. We validate the performance of our coresets on real-world datasets with large $k$, and we observe that the coreset speeds up the well-known Gonzalez algorithm by up to $4$ times, while still achieving similar clustering cost. Technically, one of our coreset results is based on a new efficient construction of consistent hashing with competitive parameters. This general tool may be of independent interest for algorithm design in high dimensional Euclidean spaces.}
}
Endnote
%0 Conference Paper
%T Faster Approximation Algorithms for k-Center via Data Reduction
%A Arnold Filtser
%A Shaofeng H.-C. Jiang
%A Yi Li
%A Anurag Murty Naredla
%A Ioannis Psarros
%A Qiaoyuan Yang
%A Qin Zhang
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-filtser25a
%I PMLR
%P 17189--17202
%U https://proceedings.mlr.press/v267/filtser25a.html
%V 267
%X We study efficient algorithms for the Euclidean $k$-Center problem, focusing on the regime of large $k$. We take the approach of data reduction by considering an $\alpha$-coreset, which is a small subset $S$ of the dataset $P$ such that any $\beta$-approximation on $S$ is an $(\alpha + \beta)$-approximation on $P$. We give efficient algorithms to construct coresets whose size is $k \cdot o(n)$, which immediately speeds up existing approximation algorithms. Notably, we obtain a near-linear time $O(1)$-approximation when $k = n^c$ for any $0 < c < 1$. We validate the performance of our coresets on real-world datasets with large $k$, and we observe that the coreset speeds up the well-known Gonzalez algorithm by up to $4$ times, while still achieving similar clustering cost. Technically, one of our coreset results is based on a new efficient construction of consistent hashing with competitive parameters. This general tool may be of independent interest for algorithm design in high dimensional Euclidean spaces.
APA
Filtser, A., Jiang, S.H., Li, Y., Naredla, A.M., Psarros, I., Yang, Q., &amp; Zhang, Q. (2025). Faster Approximation Algorithms for k-Center via Data Reduction. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:17189-17202. Available from https://proceedings.mlr.press/v267/filtser25a.html.