A comprehensive benchmark of graph neural networks, graph kernels, and classical machine learning approaches on rs-fMRI brain graphs
Proceedings of The 9th International Conference on Medical Imaging with Deep Learning, PMLR 315:4479-4495, 2026.
Abstract
Resting-state functional MRI (rs-fMRI) provides a powerful lens through which large-scale brain organization can be examined by modeling functional connectivity as a graph. These functional brain graphs now form the basis of machine-learning applications in neuroscience, ranging from relatively straightforward classification problems to more challenging behavioral and cognitive prediction tasks. While graph neural networks (GNNs) have gained increasing attention in neuroimaging, the absence of a unified, reproducible benchmark comparing GNNs with classical machine-learning models and graph kernel methods across heterogeneous datasets and tasks has made it difficult to assess their relative strengths. In this work, we introduce a comprehensive benchmarking framework spanning four heterogeneous cohorts ($N = 1513$) and multiple classification tasks, including clinical diagnosis and phenotypic prediction. We systematically evaluate classical models, graph kernels, and representative GNN architectures under a rigorous repeated nested cross-validation design and assess pairwise differences using the corrected repeated k-fold test with false-discovery-rate control. Our results show that, for this class of relatively small graphs with fixed vertex ordering, well-tuned classical ML approaches and graph kernels are competitive with GNNs while requiring substantially fewer computational resources. For instance, the Shortest-Path graph kernel achieves 0.98 accuracy on the COMA dataset, logistic regression reaches 0.81 accuracy and 0.63 MCC on HCP sex prediction, and all model families cluster closely on multi-site datasets such as ABIDE and ADHD, where no statistically significant differences emerge. All code, seeds, cross-validation folds, fold-specific hyperparameters, full prediction logs, and computational-cost measurements are publicly released to ensure full transparency and reproducibility.
This benchmark provides practical guidance for model selection in rs-fMRI connectome analysis.
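The statistical comparison named in the abstract (a corrected repeated k-fold test followed by false-discovery-rate control) can be illustrated with a minimal sketch. It assumes the variance correction commonly attributed to Nadeau and Bengio and to Bouckaert and Frank, and the Benjamini-Hochberg procedure for FDR control; the abstract does not state which variants were used, and the function names `corrected_repeated_kfold_t` and `benjamini_hochberg` are hypothetical, not taken from the released code.

```python
import math
import statistics


def corrected_repeated_kfold_t(diffs, k, r):
    """Corrected t statistic for r repetitions of k-fold cross-validation.

    diffs holds one per-fold score difference (model A minus model B)
    for each of the k * r folds. Assumed correction: the naive variance
    term 1/(k*r) is inflated by n_test/n_train = 1/(k-1) to account for
    overlapping training sets.
    """
    if k < 2 or len(diffs) != k * r:
        raise ValueError("expected k >= 2 and exactly k * r differences")
    mean_diff = statistics.fmean(diffs)
    var_diff = statistics.variance(diffs)  # sample variance (ddof = 1)
    if var_diff == 0:
        raise ValueError("zero variance: the t statistic is undefined")
    correction = 1.0 / (k * r) + 1.0 / (k - 1)
    t_stat = mean_diff / math.sqrt(correction * var_diff)
    return t_stat, k * r - 1  # statistic and degrees of freedom


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

In this sketch, a two-sided p-value for each model pair would come from the Student t distribution with the returned degrees of freedom (for example, `2 * scipy.stats.t.sf(abs(t_stat), df)`), and the resulting p-values across all pairs would then be passed to `benjamini_hochberg` and compared against the chosen FDR level.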