Deep anytime-valid hypothesis testing

Teodora Pandeva, Patrick Forré, Aaditya Ramdas, Shubhanshu Shekhar
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:622-630, 2024.

Abstract

We propose a general framework for constructing powerful, sequential hypothesis tests for a large class of nonparametric testing problems. The null hypothesis for these problems is defined in an abstract form using the action of two known operators on the data distribution. This abstraction allows for a unified treatment of several classical tasks, such as two-sample testing, independence testing, and conditional-independence testing, as well as modern problems, such as testing for adversarial robustness of machine learning (ML) models. Our proposed framework has the following advantages over classical batch tests: 1) it continuously monitors online data streams and efficiently aggregates evidence against the null, 2) it provides tight control over the type I error without the need for multiple testing correction, 3) it adapts the sample size requirement to the unknown hardness of the problem. We develop a principled approach of leveraging the representation capability of ML models within the testing-by-betting framework, a game-theoretic approach for designing sequential tests. Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines on several tasks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-pandeva24a, title = { Deep anytime-valid hypothesis testing }, author = {Pandeva, Teodora and Forr\'{e}, Patrick and Ramdas, Aaditya and Shekhar, Shubhanshu}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {622--630}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/pandeva24a/pandeva24a.pdf}, url = {https://proceedings.mlr.press/v238/pandeva24a.html}, abstract = { We propose a general framework for constructing powerful, sequential hypothesis tests for a large class of nonparametric testing problems. The null hypothesis for these problems is defined in an abstract form using the action of two known operators on the data distribution. This abstraction allows for a unified treatment of several classical tasks, such as two-sample testing, independence testing, and conditional-independence testing, as well as modern problems, such as testing for adversarial robustness of machine learning (ML) models. Our proposed framework has the following advantages over classical batch tests: 1) it continuously monitors online data streams and efficiently aggregates evidence against the null, 2) it provides tight control over the type I error without the need for multiple testing correction, 3) it adapts the sample size requirement to the unknown hardness of the problem. We develop a principled approach of leveraging the representation capability of ML models within the testing-by-betting framework, a game-theoretic approach for designing sequential tests. Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines on several tasks. } }
Endnote
%0 Conference Paper %T Deep anytime-valid hypothesis testing %A Teodora Pandeva %A Patrick Forré %A Aaditya Ramdas %A Shubhanshu Shekhar %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-pandeva24a %I PMLR %P 622--630 %U https://proceedings.mlr.press/v238/pandeva24a.html %V 238 %X We propose a general framework for constructing powerful, sequential hypothesis tests for a large class of nonparametric testing problems. The null hypothesis for these problems is defined in an abstract form using the action of two known operators on the data distribution. This abstraction allows for a unified treatment of several classical tasks, such as two-sample testing, independence testing, and conditional-independence testing, as well as modern problems, such as testing for adversarial robustness of machine learning (ML) models. Our proposed framework has the following advantages over classical batch tests: 1) it continuously monitors online data streams and efficiently aggregates evidence against the null, 2) it provides tight control over the type I error without the need for multiple testing correction, 3) it adapts the sample size requirement to the unknown hardness of the problem. We develop a principled approach of leveraging the representation capability of ML models within the testing-by-betting framework, a game-theoretic approach for designing sequential tests. Empirical results on synthetic and real-world datasets demonstrate that tests instantiated using our general framework are competitive against specialized baselines on several tasks.
APA
Pandeva, T., Forré, P., Ramdas, A. & Shekhar, S.. (2024). Deep anytime-valid hypothesis testing . Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:622-630 Available from https://proceedings.mlr.press/v238/pandeva24a.html.

Related Material