Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model

Kaito Ariu, Alexandre Proutiere, Se-Young Yun
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:1650-1689, 2025.

Abstract

In this paper, we investigate the problem of recovering hidden communities in the Labeled Stochastic Block Model (LSBM) with a finite number of clusters whose sizes grow linearly with the total number of nodes. We derive the necessary and sufficient conditions under which the expected number of misclassified nodes is less than $ s $, for any number $ s = o(n) $. To achieve this, we propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability. IAC is a novel two-phase algorithm that consists of a one-shot spectral clustering step followed by iterative likelihood-based cluster assignment improvements. This approach is based on the instance-specific lower bound and notably does not require any knowledge of the model parameters, including the number of clusters. By performing the spectral clustering only once, IAC maintains an overall computational complexity of $ \mathcal{O}(n\, \text{polylog}(n)) $, making it scalable and practical for large-scale problems.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-ariu25a, title = {Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model}, author = {Ariu, Kaito and Proutiere, Alexandre and Yun, Se-Young}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {1650--1689}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/ariu25a/ariu25a.pdf}, url = {https://proceedings.mlr.press/v267/ariu25a.html}, abstract = {In this paper, we investigate the problem of recovering hidden communities in the Labeled Stochastic Block Model (LSBM) with a finite number of clusters whose sizes grow linearly with the total number of nodes. We derive the necessary and sufficient conditions under which the expected number of misclassified nodes is less than $ s $, for any number $ s = o(n) $. To achieve this, we propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability. IAC is a novel two-phase algorithm that consists of a one-shot spectral clustering step followed by iterative likelihood-based cluster assignment improvements. This approach is based on the instance-specific lower bound and notably does not require any knowledge of the model parameters, including the number of clusters. By performing the spectral clustering only once, IAC maintains an overall computational complexity of $ \mathcal{O}(n\, \text{polylog}(n)) $, making it scalable and practical for large-scale problems.} }
Endnote
%0 Conference Paper %T Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model %A Kaito Ariu %A Alexandre Proutiere %A Se-Young Yun %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-ariu25a %I PMLR %P 1650--1689 %U https://proceedings.mlr.press/v267/ariu25a.html %V 267 %X In this paper, we investigate the problem of recovering hidden communities in the Labeled Stochastic Block Model (LSBM) with a finite number of clusters whose sizes grow linearly with the total number of nodes. We derive the necessary and sufficient conditions under which the expected number of misclassified nodes is less than $ s $, for any number $ s = o(n) $. To achieve this, we propose IAC (Instance-Adaptive Clustering), the first algorithm whose performance matches the instance-specific lower bounds both in expectation and with high probability. IAC is a novel two-phase algorithm that consists of a one-shot spectral clustering step followed by iterative likelihood-based cluster assignment improvements. This approach is based on the instance-specific lower bound and notably does not require any knowledge of the model parameters, including the number of clusters. By performing the spectral clustering only once, IAC maintains an overall computational complexity of $ \mathcal{O}(n\, \text{polylog}(n)) $, making it scalable and practical for large-scale problems.
APA
Ariu, K., Proutiere, A. & Yun, S.. (2025). Revisiting Instance-Optimal Cluster Recovery in the Labeled Stochastic Block Model. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:1650-1689 Available from https://proceedings.mlr.press/v267/ariu25a.html.

Related Material