Two-Sample Tests for Inhomogeneous Random Graphs in $L_r$ Norm: Optimality and Asymptotics

Sayak Chatterjee, Dibyendu Saha, Soham Dan, Bhaswar B. Bhattacharya
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:6903-6911, 2023.

Abstract

In this paper we study the two-sample problem for inhomogeneous Erdős-Rényi (IER), random graph models, in the $L_r$ norm, in the high-dimensional regime where the number of samples is smaller or comparable to the size of the graphs. Given two symmetric matrices $P, Q \in [0, 1]^{n \times n}$ (with zeros on the diagonals), the two-sample problem for IER graphs (with respect to the $L_r$ norm $||\cdot||_r$) is to test the hypothesis $H_0: P=Q$ versus $H_1: ||P-Q||_r \geq \varepsilon$, given a sample of $m$ graphs from the respective distributions. In this paper, we obtain the optimal sample complexity for testing in the $L_r$-norm, for all integers $r \geq 1$. We also derive the asymptotic distribution of the optimal tests under $H_0$ and develop a method for consistently estimating their variances. This allows us to efficiently implement the optimal tests with precise asymptotic level and establish their asymptotic consistency. We validate our theoretical results by numerical experiments for various natural IER models.

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-chatterjee23a, title = {Two-Sample Tests for Inhomogeneous Random Graphs in $L_r$ Norm: Optimality and Asymptotics}, author = {Chatterjee, Sayak and Saha, Dibyendu and Dan, Soham and Bhattacharya, Bhaswar B.}, booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics}, pages = {6903--6911}, year = {2023}, editor = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem}, volume = {206}, series = {Proceedings of Machine Learning Research}, month = {25--27 Apr}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v206/chatterjee23a/chatterjee23a.pdf}, url = {https://proceedings.mlr.press/v206/chatterjee23a.html}, abstract = {In this paper we study the two-sample problem for inhomogeneous Erdős-Rényi (IER), random graph models, in the $L_r$ norm, in the high-dimensional regime where the number of samples is smaller or comparable to the size of the graphs. Given two symmetric matrices $P, Q \in [0, 1]^{n \times n}$ (with zeros on the diagonals), the two-sample problem for IER graphs (with respect to the $L_r$ norm $||\cdot||_r$) is to test the hypothesis $H_0: P=Q$ versus $H_1: ||P-Q||_r \geq \varepsilon$, given a sample of $m$ graphs from the respective distributions. In this paper, we obtain the optimal sample complexity for testing in the $L_r$-norm, for all integers $r \geq 1$. We also derive the asymptotic distribution of the optimal tests under $H_0$ and develop a method for consistently estimating their variances. This allows us to efficiently implement the optimal tests with precise asymptotic level and establish their asymptotic consistency. We validate our theoretical results by numerical experiments for various natural IER models.} }
Endnote
%0 Conference Paper %T Two-Sample Tests for Inhomogeneous Random Graphs in $L_r$ Norm: Optimality and Asymptotics %A Sayak Chatterjee %A Dibyendu Saha %A Soham Dan %A Bhaswar B. Bhattacharya %B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2023 %E Francisco Ruiz %E Jennifer Dy %E Jan-Willem van de Meent %F pmlr-v206-chatterjee23a %I PMLR %P 6903--6911 %U https://proceedings.mlr.press/v206/chatterjee23a.html %V 206 %X In this paper we study the two-sample problem for inhomogeneous Erdős-Rényi (IER), random graph models, in the $L_r$ norm, in the high-dimensional regime where the number of samples is smaller or comparable to the size of the graphs. Given two symmetric matrices $P, Q \in [0, 1]^{n \times n}$ (with zeros on the diagonals), the two-sample problem for IER graphs (with respect to the $L_r$ norm $||\cdot||_r$) is to test the hypothesis $H_0: P=Q$ versus $H_1: ||P-Q||_r \geq \varepsilon$, given a sample of $m$ graphs from the respective distributions. In this paper, we obtain the optimal sample complexity for testing in the $L_r$-norm, for all integers $r \geq 1$. We also derive the asymptotic distribution of the optimal tests under $H_0$ and develop a method for consistently estimating their variances. This allows us to efficiently implement the optimal tests with precise asymptotic level and establish their asymptotic consistency. We validate our theoretical results by numerical experiments for various natural IER models.
APA
Chatterjee, S., Saha, D., Dan, S. & Bhattacharya, B.B.. (2023). Two-Sample Tests for Inhomogeneous Random Graphs in $L_r$ Norm: Optimality and Asymptotics. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:6903-6911 Available from https://proceedings.mlr.press/v206/chatterjee23a.html.

Related Material