SubSearch: Robust Estimation and Outlier Detection for Stochastic Block Models via Subgraph Search

Leonardo Bianco, Christine Keribin, Zacharie Naulet
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:1297-1305, 2025.

Abstract

Community detection is a fundamental task in graph analysis, with methods often relying on fitting models like the Stochastic Block Model (SBM) to observed networks. While many algorithms can accurately estimate SBM parameters when the input graph is a perfect sample from the model, real-world graphs rarely conform to such idealized assumptions. Therefore, robust algorithms are crucial—ones that can recover model parameters even when the data deviates from the assumed distribution. In this work, we propose SubSearch, an algorithm for robustly estimating SBM parameters by exploring the space of subgraphs in search of one that closely aligns with the model’s assumptions. Our approach also functions as an outlier detection method, properly identifying nodes responsible for the graph’s deviation from the model and going beyond simple techniques like pruning high-degree nodes. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of our method.

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-bianco25a,
  title = {SubSearch: Robust Estimation and Outlier Detection for Stochastic Block Models via Subgraph Search},
  author = {Bianco, Leonardo and Keribin, Christine and Naulet, Zacharie},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = {1297--1305},
  year = {2025},
  editor = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = {258},
  series = {Proceedings of Machine Learning Research},
  month = {03--05 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/bianco25a/bianco25a.pdf},
  url = {https://proceedings.mlr.press/v258/bianco25a.html},
  abstract = {Community detection is a fundamental task in graph analysis, with methods often relying on fitting models like the Stochastic Block Model (SBM) to observed networks. While many algorithms can accurately estimate SBM parameters when the input graph is a perfect sample from the model, real-world graphs rarely conform to such idealized assumptions. Therefore, robust algorithms are crucial—ones that can recover model parameters even when the data deviates from the assumed distribution. In this work, we propose SubSearch, an algorithm for robustly estimating SBM parameters by exploring the space of subgraphs in search of one that closely aligns with the model’s assumptions. Our approach also functions as an outlier detection method, properly identifying nodes responsible for the graph’s deviation from the model and going beyond simple techniques like pruning high-degree nodes. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of our method.}
}
Endnote
%0 Conference Paper
%T SubSearch: Robust Estimation and Outlier Detection for Stochastic Block Models via Subgraph Search
%A Leonardo Bianco
%A Christine Keribin
%A Zacharie Naulet
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-bianco25a
%I PMLR
%P 1297--1305
%U https://proceedings.mlr.press/v258/bianco25a.html
%V 258
%X Community detection is a fundamental task in graph analysis, with methods often relying on fitting models like the Stochastic Block Model (SBM) to observed networks. While many algorithms can accurately estimate SBM parameters when the input graph is a perfect sample from the model, real-world graphs rarely conform to such idealized assumptions. Therefore, robust algorithms are crucial—ones that can recover model parameters even when the data deviates from the assumed distribution. In this work, we propose SubSearch, an algorithm for robustly estimating SBM parameters by exploring the space of subgraphs in search of one that closely aligns with the model’s assumptions. Our approach also functions as an outlier detection method, properly identifying nodes responsible for the graph’s deviation from the model and going beyond simple techniques like pruning high-degree nodes. Extensive experiments on both synthetic and real-world datasets demonstrate the effectiveness of our method.
APA
Bianco, L., Keribin, C. & Naulet, Z. (2025). SubSearch: Robust Estimation and Outlier Detection for Stochastic Block Models via Subgraph Search. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:1297-1305. Available from https://proceedings.mlr.press/v258/bianco25a.html.
