An Efficient Semismooth Newton based Algorithm for Convex Clustering

Yancheng Yuan, Defeng Sun, Kim-Chuan Toh
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5718-5726, 2018.

Abstract

Clustering is a fundamental problem in unsupervised learning. Popular methods like K-means, may suffer from instability as they are prone to get stuck in its local minima. Recently, the sumof-norms (SON) model (also known as clustering path), which is a convex relaxation of hierarchical clustering model, has been proposed in (Lindsten et al., 2011) and (Hocking et al., 2011). Although numerical algorithms like alternating direction method of multipliers (ADMM) and alternating minimization algorithm (AMA) have been proposed to solve convex clustering model (Chi & Lange, 2015), it is known to be very challenging to solve large-scale problems. In this paper, we propose a semismooth Newton based augmented Lagrangian method for large-scale convex clustering problems. Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. Moreover, the numerical results also show the superior performance and scalability of our algorithm comparing to existing first-order methods.

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-yuan18a, title = {An Efficient Semismooth {N}ewton based Algorithm for Convex Clustering}, author = {Yuan, Yancheng and Sun, Defeng and Toh, Kim-Chuan}, booktitle = {Proceedings of the 35th International Conference on Machine Learning}, pages = {5718--5726}, year = {2018}, editor = {Dy, Jennifer and Krause, Andreas}, volume = {80}, series = {Proceedings of Machine Learning Research}, month = {10--15 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v80/yuan18a/yuan18a.pdf}, url = {https://proceedings.mlr.press/v80/yuan18a.html}, abstract = {Clustering is a fundamental problem in unsupervised learning. Popular methods like K-means, may suffer from instability as they are prone to get stuck in its local minima. Recently, the sumof-norms (SON) model (also known as clustering path), which is a convex relaxation of hierarchical clustering model, has been proposed in (Lindsten et al., 2011) and (Hocking et al., 2011). Although numerical algorithms like alternating direction method of multipliers (ADMM) and alternating minimization algorithm (AMA) have been proposed to solve convex clustering model (Chi & Lange, 2015), it is known to be very challenging to solve large-scale problems. In this paper, we propose a semismooth Newton based augmented Lagrangian method for large-scale convex clustering problems. Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. Moreover, the numerical results also show the superior performance and scalability of our algorithm comparing to existing first-order methods.} }
Endnote
%0 Conference Paper %T An Efficient Semismooth Newton based Algorithm for Convex Clustering %A Yancheng Yuan %A Defeng Sun %A Kim-Chuan Toh %B Proceedings of the 35th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2018 %E Jennifer Dy %E Andreas Krause %F pmlr-v80-yuan18a %I PMLR %P 5718--5726 %U https://proceedings.mlr.press/v80/yuan18a.html %V 80 %X Clustering is a fundamental problem in unsupervised learning. Popular methods like K-means, may suffer from instability as they are prone to get stuck in its local minima. Recently, the sumof-norms (SON) model (also known as clustering path), which is a convex relaxation of hierarchical clustering model, has been proposed in (Lindsten et al., 2011) and (Hocking et al., 2011). Although numerical algorithms like alternating direction method of multipliers (ADMM) and alternating minimization algorithm (AMA) have been proposed to solve convex clustering model (Chi & Lange, 2015), it is known to be very challenging to solve large-scale problems. In this paper, we propose a semismooth Newton based augmented Lagrangian method for large-scale convex clustering problems. Extensive numerical experiments on both simulated and real data demonstrate that our algorithm is highly efficient and robust for solving large-scale problems. Moreover, the numerical results also show the superior performance and scalability of our algorithm comparing to existing first-order methods.
APA
Yuan, Y., Sun, D. & Toh, K.. (2018). An Efficient Semismooth Newton based Algorithm for Convex Clustering. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5718-5726 Available from https://proceedings.mlr.press/v80/yuan18a.html.

Related Material