Sample Complexity of Correlation Detection in the Gaussian Wigner Model

Dong Huang, Pengkun Yang
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:26020-26040, 2025.

Abstract

Correlation analysis is a fundamental step in uncovering meaningful insights from complex datasets. In this paper, we study the problem of detecting correlations between two random graphs following the Gaussian Wigner model with unlabeled vertices. Specifically, the task is formulated as a hypothesis testing problem: under the null hypothesis, the two graphs are independent, while under the alternative hypothesis, they are edge-correlated through a latent vertex permutation, yet maintain the same marginal distributions as under the null. We focus on the scenario where two induced subgraphs, each with a fixed number of vertices, are sampled. We determine the optimal rate for the sample size required for correlation detection, derived through an analysis of the conditional second moment. Additionally, we propose an efficient approximate algorithm that significantly reduces running time.

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-huang25aq, title = {Sample Complexity of Correlation Detection in the {G}aussian Wigner Model}, author = {Huang, Dong and Yang, Pengkun}, booktitle = {Proceedings of the 42nd International Conference on Machine Learning}, pages = {26020--26040}, year = {2025}, editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry}, volume = {267}, series = {Proceedings of Machine Learning Research}, month = {13--19 Jul}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/huang25aq/huang25aq.pdf}, url = {https://proceedings.mlr.press/v267/huang25aq.html}, abstract = {Correlation analysis is a fundamental step in uncovering meaningful insights from complex datasets. In this paper, we study the problem of detecting correlations between two random graphs following the Gaussian Wigner model with unlabeled vertices. Specifically, the task is formulated as a hypothesis testing problem: under the null hypothesis, the two graphs are independent, while under the alternative hypothesis, they are edge-correlated through a latent vertex permutation, yet maintain the same marginal distributions as under the null. We focus on the scenario where two induced subgraphs, each with a fixed number of vertices, are sampled. We determine the optimal rate for the sample size required for correlation detection, derived through an analysis of the conditional second moment. Additionally, we propose an efficient approximate algorithm that significantly reduces running time.} }
Endnote
%0 Conference Paper %T Sample Complexity of Correlation Detection in the Gaussian Wigner Model %A Dong Huang %A Pengkun Yang %B Proceedings of the 42nd International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2025 %E Aarti Singh %E Maryam Fazel %E Daniel Hsu %E Simon Lacoste-Julien %E Felix Berkenkamp %E Tegan Maharaj %E Kiri Wagstaff %E Jerry Zhu %F pmlr-v267-huang25aq %I PMLR %P 26020--26040 %U https://proceedings.mlr.press/v267/huang25aq.html %V 267 %X Correlation analysis is a fundamental step in uncovering meaningful insights from complex datasets. In this paper, we study the problem of detecting correlations between two random graphs following the Gaussian Wigner model with unlabeled vertices. Specifically, the task is formulated as a hypothesis testing problem: under the null hypothesis, the two graphs are independent, while under the alternative hypothesis, they are edge-correlated through a latent vertex permutation, yet maintain the same marginal distributions as under the null. We focus on the scenario where two induced subgraphs, each with a fixed number of vertices, are sampled. We determine the optimal rate for the sample size required for correlation detection, derived through an analysis of the conditional second moment. Additionally, we propose an efficient approximate algorithm that significantly reduces running time.
APA
Huang, D. & Yang, P.. (2025). Sample Complexity of Correlation Detection in the Gaussian Wigner Model. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:26020-26040 Available from https://proceedings.mlr.press/v267/huang25aq.html.

Related Material