Testing Exchangeability for Multiple Sequences of P-values

Henrik Boström
Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications, PMLR 266:615-632, 2025.

Abstract

Given a sequence of p-values, conformal test martingales can be used for signaling that the exchangeability assumption is violated, while the false alarm rate is controlled by a user-specified significance level. In some scenarios, multiple p-values are observed at each time step, e.g., p-values may be received from multiple conformal predictors for a single target, or p-values are obtained for multiple targets. In such cases, signaling whenever a violation is detected for any of the sequences leads to an increased risk of false alarms. Bonferroni correction, which is a standard approach to controlling the error rate when testing multiple hypotheses, is shown to be dominated by the straightforward approach of forming a single conformal test martingale from the martingales generated from the individual sequences of p-values. In addition to testing exchangeability for the individual sequences, approaches for testing them jointly are also investigated. For the latter, the use of aggregation operators to transform multiple sequences of p-values into a single sequence is investigated, as well as a previously proposed approach for detecting covariate shift. Experimental results are presented, highlighting the potential strengths and weaknesses of the different approaches.
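The comparison between Bonferroni correction and a single combined martingale can be illustrated with a minimal sketch. It rests on assumptions that the abstract does not confirm: each sequence is tracked with a simple power martingale using a fixed exponent `eps`, and the combined martingale is taken to be the arithmetic mean of the individual ones. The function names, the synthetic p-values, and the parameter values are all hypothetical. Under these assumptions, an alarm is raised when a martingale reaches 1/alpha; Bonferroni with K sequences raises the per-sequence threshold to K/alpha. Because the martingales are nonnegative, any value of at least K/alpha forces the mean to be at least 1/alpha, so the averaged test never signals later than the Bonferroni test.

```python
import numpy as np

def power_martingale(p, eps=0.5):
    """Running product of the betting function eps * p**(eps - 1).

    Assumption: a fixed-exponent power martingale; the paper may use
    other betting functions.
    """
    p = np.asarray(p, dtype=float)
    return np.cumprod(eps * p ** (eps - 1.0))

def first_alarm(values, threshold):
    """Index of the first step where values >= threshold, or None."""
    hits = np.flatnonzero(values >= threshold)
    return int(hits[0]) if hits.size else None

def bonferroni_alarm(martingales, alpha):
    """Alarm when any of the K martingales reaches K / alpha."""
    k = martingales.shape[0]
    return first_alarm(martingales.max(axis=0), k / alpha)

def averaged_alarm(martingales, alpha):
    """Alarm when the mean of the K martingales reaches 1 / alpha.

    Assumption: the single combined martingale is the arithmetic mean.
    """
    return first_alarm(martingales.mean(axis=0), 1.0 / alpha)

# Synthetic illustration (hypothetical data): three sequences of
# p-values, where the first one drifts towards small values after
# step 200, mimicking a violation of exchangeability.
rng = np.random.default_rng(0)
n_steps, alpha = 500, 0.05
p = rng.uniform(size=(3, n_steps))
p[0, 200:] = rng.uniform(size=n_steps - 200) ** 3

martingales = np.vstack([power_martingale(row) for row in p])
bonf = bonferroni_alarm(martingales, alpha)
avg = averaged_alarm(martingales, alpha)
print("Bonferroni alarm step:", bonf)
print("Averaged alarm step:  ", avg)
```

The sketch only illustrates the threshold argument for testing the individual sequences; it does not cover the aggregation operators or the covariate-shift approach that the paper investigates for joint testing.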

Cite this Paper


BibTeX
@InProceedings{pmlr-v266-bostrom25a,
  title = {Testing Exchangeability for Multiple Sequences of P-values},
  author = {Bostr\"{o}m, Henrik},
  booktitle = {Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications},
  pages = {615--632},
  year = {2025},
  editor = {Nguyen, Khuong An and Luo, Zhiyuan and Papadopoulos, Harris and Löfström, Tuwe and Carlsson, Lars and Boström, Henrik},
  volume = {266},
  series = {Proceedings of Machine Learning Research},
  month = {10--12 Sep},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v266/main/assets/bostrom25a/bostrom25a.pdf},
  url = {https://proceedings.mlr.press/v266/bostrom25a.html},
  abstract = {Given a sequence of p-values, conformal test martingales can be used for signaling that the exchangeability assumption is violated, while the false alarm rate is controlled by a user-specified significance level. In some scenarios, multiple p-values are observed at each time step, e.g., p-values may be received from multiple conformal predictors for a single target, or p-values are obtained for multiple targets. In such cases, signaling whenever a violation is detected for any of the sequences leads to an increased risk of false alarms. Bonferroni correction, which is a standard approach to controlling the error rate when testing multiple hypotheses, is shown to be dominated by the straightforward approach of forming a single conformal test martingale from the martingales generated from the individual sequences of p-values. In addition to testing exchangeability for the individual sequences, approaches for testing them jointly are also investigated. For the latter, the use of aggregation operators to transform multiple sequences of p-values into a single sequence is investigated, as well as a previously proposed approach for detecting covariate shift. Experimental results are presented, highlighting the potential strengths and weaknesses of the different approaches.}
}
Endnote
%0 Conference Paper
%T Testing Exchangeability for Multiple Sequences of P-values
%A Henrik Boström
%B Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications
%C Proceedings of Machine Learning Research
%D 2025
%E Khuong An Nguyen
%E Zhiyuan Luo
%E Harris Papadopoulos
%E Tuwe Löfström
%E Lars Carlsson
%E Henrik Boström
%F pmlr-v266-bostrom25a
%I PMLR
%P 615--632
%U https://proceedings.mlr.press/v266/bostrom25a.html
%V 266
%X Given a sequence of p-values, conformal test martingales can be used for signaling that the exchangeability assumption is violated, while the false alarm rate is controlled by a user-specified significance level. In some scenarios, multiple p-values are observed at each time step, e.g., p-values may be received from multiple conformal predictors for a single target, or p-values are obtained for multiple targets. In such cases, signaling whenever a violation is detected for any of the sequences leads to an increased risk of false alarms. Bonferroni correction, which is a standard approach to controlling the error rate when testing multiple hypotheses, is shown to be dominated by the straightforward approach of forming a single conformal test martingale from the martingales generated from the individual sequences of p-values. In addition to testing exchangeability for the individual sequences, approaches for testing them jointly are also investigated. For the latter, the use of aggregation operators to transform multiple sequences of p-values into a single sequence is investigated, as well as a previously proposed approach for detecting covariate shift. Experimental results are presented, highlighting the potential strengths and weaknesses of the different approaches.
APA
Boström, H. (2025). Testing Exchangeability for Multiple Sequences of P-values. Proceedings of the Fourteenth Symposium on Conformal and Probabilistic Prediction with Applications, in Proceedings of Machine Learning Research 266:615-632. Available from https://proceedings.mlr.press/v266/bostrom25a.html.

Related Material