Hierarchical Overlapping Clustering on Graphs: Cost Function, Algorithm and Scalability
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:47565-47589, 2025.
Abstract
Hierarchical and overlapping clustering are two prevalent phenomena that often coexist in real-world systems. While numerous studies have examined these two structures separately, characterizing and evaluating their hybrid forms remains an open challenge. To bridge this gap, we initiate the study of hierarchical overlapping clustering on graphs by introducing a new cost function and establishing its rationality through several intuitive properties. We further develop an approximation algorithm that achieves a constant approximation factor for its dual version. Our approach employs a recursive overlapping bipartition framework based on local search, enabling a highly scalable speed-up variant. Experimental results demonstrate that this speed-up algorithm significantly outperforms all baseline methods in both effectiveness (across synthetic and real datasets) and scalability.