Global Optimization of K-Center Clustering

Mingfei Shi, Kaixun Hua, Jiayang Ren, Yankai Cao
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:19956-19966, 2022.

Abstract

$k$-center problem is a well-known clustering method and can be formulated as a mixed-integer nonlinear programming problem. This work provides a practical global optimization algorithm for this task based on a reduced-space spatial branch and bound scheme. This algorithm can guarantee convergence to the global optimum by only branching on the centers of clusters, which is independent of the dataset’s cardinality. In addition, a set of feasibility-based bounds tightening techniques are proposed to narrow down the domain of centers and significantly accelerate the convergence. To demonstrate the capacity of this algorithm, we present computational results on 32 datasets. Notably, for the dataset with 14 million samples and 3 features, the serial implementation of the algorithm can converge to an optimality gap of 0.1% within 2 hours. Compared with a heuristic method, the global optimum obtained by our algorithm can reduce the objective function on average by 30.4%.

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-shi22b, title = {Global Optimization of K-Center Clustering}, author = {Shi, Mingfei and Hua, Kaixun and Ren, Jiayang and Cao, Yankai}, booktitle = {Proceedings of the 39th International Conference on Machine Learning}, pages = {19956--19966}, year = {2022}, editor = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan}, volume = {162}, series = {Proceedings of Machine Learning Research}, month = {17--23 Jul}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v162/shi22b/shi22b.pdf}, url = {https://proceedings.mlr.press/v162/shi22b.html}, abstract = {$k$-center problem is a well-known clustering method and can be formulated as a mixed-integer nonlinear programming problem. This work provides a practical global optimization algorithm for this task based on a reduced-space spatial branch and bound scheme. This algorithm can guarantee convergence to the global optimum by only branching on the centers of clusters, which is independent of the dataset’s cardinality. In addition, a set of feasibility-based bounds tightening techniques are proposed to narrow down the domain of centers and significantly accelerate the convergence. To demonstrate the capacity of this algorithm, we present computational results on 32 datasets. Notably, for the dataset with 14 million samples and 3 features, the serial implementation of the algorithm can converge to an optimality gap of 0.1% within 2 hours. Compared with a heuristic method, the global optimum obtained by our algorithm can reduce the objective function on average by 30.4%.} }
Endnote
%0 Conference Paper %T Global Optimization of K-Center Clustering %A Mingfei Shi %A Kaixun Hua %A Jiayang Ren %A Yankai Cao %B Proceedings of the 39th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2022 %E Kamalika Chaudhuri %E Stefanie Jegelka %E Le Song %E Csaba Szepesvari %E Gang Niu %E Sivan Sabato %F pmlr-v162-shi22b %I PMLR %P 19956--19966 %U https://proceedings.mlr.press/v162/shi22b.html %V 162 %X $k$-center problem is a well-known clustering method and can be formulated as a mixed-integer nonlinear programming problem. This work provides a practical global optimization algorithm for this task based on a reduced-space spatial branch and bound scheme. This algorithm can guarantee convergence to the global optimum by only branching on the centers of clusters, which is independent of the dataset’s cardinality. In addition, a set of feasibility-based bounds tightening techniques are proposed to narrow down the domain of centers and significantly accelerate the convergence. To demonstrate the capacity of this algorithm, we present computational results on 32 datasets. Notably, for the dataset with 14 million samples and 3 features, the serial implementation of the algorithm can converge to an optimality gap of 0.1% within 2 hours. Compared with a heuristic method, the global optimum obtained by our algorithm can reduce the objective function on average by 30.4%.
APA
Shi, M., Hua, K., Ren, J. & Cao, Y.. (2022). Global Optimization of K-Center Clustering. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:19956-19966 Available from https://proceedings.mlr.press/v162/shi22b.html.

Related Material