Collaborative non-parametric two-sample testing

Alejandro David De la Concha Duarte; Nicolas Vayatis; Argyris Kalogeratos

Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:838-846, 2025.

Abstract

The multiple two-sample testing problem in a graph-structured setting is a common scenario in fields such as Spatial Statistics and Neuroscience. Each node $v$ of a fixed graph faces a two-sample testing problem between two node-specific probability density functions, $p_v$ and $q_v$. The goal is to identify the nodes where the null hypothesis $p_v = q_v$ should be rejected, under the assumption that connected nodes yield similar test outcomes. We propose the non-parametric collaborative two-sample testing (CTST) framework, which efficiently leverages the graph structure and minimizes the assumptions on $p_v$ and $q_v$. CTST integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning. We use synthetic experiments and a real sensor network detecting seismic activity to demonstrate that CTST outperforms state-of-the-art non-parametric statistical tests that are applied at each node independently and hence disregard the geometry of the problem.

Cite this Paper

BibTeX

@InProceedings{pmlr-v258-concha-duarte25a,
  title = 	 {Collaborative non-parametric two-sample testing},
  author =       {De la Concha Duarte, Alejandro David and Vayatis, Nicolas and Kalogeratos, Argyris},
  booktitle = 	 {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {838--846},
  year = 	 {2025},
  editor = 	 {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = 	 {258},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {03--05 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v258/main/assets/concha-duarte25a/concha-duarte25a.pdf},
  url = 	 {https://proceedings.mlr.press/v258/concha-duarte25a.html},
  abstract = 	 {The multiple two-sample testing problem in a graph-structured setting is a common scenario in fields such as Spatial Statistics and Neuroscience. Each node $v$ of a fixed graph faces a two-sample testing problem between two node-specific probability density functions, $p_v$ and $q_v$. The goal is to identify the nodes where the null hypothesis $p_v = q_v$ should be rejected, under the assumption that connected nodes yield similar test outcomes. We propose the non-parametric collaborative two-sample testing (CTST) framework, which efficiently leverages the graph structure and minimizes the assumptions on $p_v$ and $q_v$. CTST integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning. We use synthetic experiments and a real sensor network detecting seismic activity to demonstrate that CTST outperforms state-of-the-art non-parametric statistical tests that are applied at each node independently and hence disregard the geometry of the problem.}
}

Endnote

%0 Conference Paper
%T Collaborative non-parametric two-sample testing
%A Alejandro David De la Concha Duarte
%A Nicolas Vayatis
%A Argyris Kalogeratos
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-concha-duarte25a
%I PMLR
%P 838--846
%U https://proceedings.mlr.press/v258/concha-duarte25a.html
%V 258
%X The multiple two-sample testing problem in a graph-structured setting is a common scenario in fields such as Spatial Statistics and Neuroscience. Each node $v$ of a fixed graph faces a two-sample testing problem between two node-specific probability density functions, $p_v$ and $q_v$. The goal is to identify the nodes where the null hypothesis $p_v = q_v$ should be rejected, under the assumption that connected nodes yield similar test outcomes. We propose the non-parametric collaborative two-sample testing (CTST) framework, which efficiently leverages the graph structure and minimizes the assumptions on $p_v$ and $q_v$. CTST integrates elements from f-divergence estimation, Kernel Methods, and Multitask Learning. We use synthetic experiments and a real sensor network detecting seismic activity to demonstrate that CTST outperforms state-of-the-art non-parametric statistical tests that are applied at each node independently and hence disregard the geometry of the problem.

APA

De la Concha Duarte, A.D., Vayatis, N. & Kalogeratos, A. (2025). Collaborative non-parametric two-sample testing. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:838-846. Available from https://proceedings.mlr.press/v258/concha-duarte25a.html.
