Generalized Group Testing

Xiwei Cheng, Sidharth Jaggi, Qiaoqiao Zhou
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:10777-10835, 2022.

Abstract

In the problem of classical group testing one aims to identify a small subset (of size $d$) diseased individuals/defective items in a large population (of size $n$) via a minimal number of suitably-designed group tests on subsets of items, where the test outcome is positive iff the given test contains at least one defective item. Motivated by physical considerations, we consider a generalized setting that includes as special cases multiple other group-testing-like models in the literature. In our setting, which subsumes as special cases a variety of noiseless and noisy group-testing models in the literature, the test outcome is positive with probability $f(x)$, where $x$ is the number of defectives tested in a pool, and $f(\cdot)$ is an arbitrary {\it monotonically increasing} (stochastic) test function. Our main contributions are as follows. 1. We present a non-adaptive scheme that with probability $1-\varepsilon$ identifies all defective items. Our scheme requires at most ${\cal O}( H(f) d\log(n/\varepsilon))$ tests, where $H(f)$ is a suitably defined “sensitivity parameter" of $f(\cdot)$, and is never larger than ${\cal O}(d^{1+o(1)})$, but may be substantially smaller for many $f(\cdot)$. 2. We argue that any non-adaptive group testing scheme needs at least $\Omega (h(f) d\log(n/d))$ tests to ensure high reliability recovery. Here $h(f)$ is a suitably defined “concentration parameter" of $f(\cdot)$, and $h(f) \in \Omega{(1)}$. 3. We prove that our sample-complexity bounds for generalized group testing are information-theoretically near-optimal for a variety of sparse-recovery group-testing models in the literature. That is, for {\it any} “noisy" test function $f(\cdot)$ (i.e. $0< f(0) < f(d) <1$), and for a variety of “(one-sided) noiseless" test functions $f(\cdot)$ (i.e., either $f(0)=0$, or $f(d)=1$, or both) studied in the literature we show that $H(f)/h(f) \in \Theta(1)$. As a by-product we tightly characterize the heretofore open information-theoretic sample-complexity for the well-studied model of threshold group-testing. For general (near)-noiseless test functions $f(\cdot)$ we show that $H(f)/h(f) \in {\cal O}(d^{1+o(1)})$. We also demonstrate a “natural" test-function $f(\cdot)$ whose sample complexity scales “extremally" as $\Theta ( d^2\log(n))$, rather than $\Theta ( d\log(n))$ as in the case of classical group-testing. Some of our techniques may be of independent interest – in particular our achievability requires a delicate saddle-point approximation, and our impossibility proof relies on a novel bound relating the mutual information of pair of random variables with the mean and variance of a specific function, and we derive novel structural results about monotone functions.

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-cheng22a, title = { Generalized Group Testing }, author = {Cheng, Xiwei and Jaggi, Sidharth and Zhou, Qiaoqiao}, booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics}, pages = {10777--10835}, year = {2022}, editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel}, volume = {151}, series = {Proceedings of Machine Learning Research}, month = {28--30 Mar}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v151/cheng22a/cheng22a.pdf}, url = {https://proceedings.mlr.press/v151/cheng22a.html}, abstract = { In the problem of classical group testing one aims to identify a small subset (of size $d$) diseased individuals/defective items in a large population (of size $n$) via a minimal number of suitably-designed group tests on subsets of items, where the test outcome is positive iff the given test contains at least one defective item. Motivated by physical considerations, we consider a generalized setting that includes as special cases multiple other group-testing-like models in the literature. In our setting, which subsumes as special cases a variety of noiseless and noisy group-testing models in the literature, the test outcome is positive with probability $f(x)$, where $x$ is the number of defectives tested in a pool, and $f(\cdot)$ is an arbitrary {\it monotonically increasing} (stochastic) test function. Our main contributions are as follows. 1. We present a non-adaptive scheme that with probability $1-\varepsilon$ identifies all defective items. Our scheme requires at most ${\cal O}( H(f) d\log(n/\varepsilon))$ tests, where $H(f)$ is a suitably defined “sensitivity parameter" of $f(\cdot)$, and is never larger than ${\cal O}(d^{1+o(1)})$, but may be substantially smaller for many $f(\cdot)$. 2. We argue that any non-adaptive group testing scheme needs at least $\Omega (h(f) d\log(n/d))$ tests to ensure high reliability recovery. Here $h(f)$ is a suitably defined “concentration parameter" of $f(\cdot)$, and $h(f) \in \Omega{(1)}$. 3. We prove that our sample-complexity bounds for generalized group testing are information-theoretically near-optimal for a variety of sparse-recovery group-testing models in the literature. That is, for {\it any} “noisy" test function $f(\cdot)$ (i.e. $0< f(0) < f(d) <1$), and for a variety of “(one-sided) noiseless" test functions $f(\cdot)$ (i.e., either $f(0)=0$, or $f(d)=1$, or both) studied in the literature we show that $H(f)/h(f) \in \Theta(1)$. As a by-product we tightly characterize the heretofore open information-theoretic sample-complexity for the well-studied model of threshold group-testing. For general (near)-noiseless test functions $f(\cdot)$ we show that $H(f)/h(f) \in {\cal O}(d^{1+o(1)})$. We also demonstrate a “natural" test-function $f(\cdot)$ whose sample complexity scales “extremally" as $\Theta ( d^2\log(n))$, rather than $\Theta ( d\log(n))$ as in the case of classical group-testing. Some of our techniques may be of independent interest – in particular our achievability requires a delicate saddle-point approximation, and our impossibility proof relies on a novel bound relating the mutual information of pair of random variables with the mean and variance of a specific function, and we derive novel structural results about monotone functions. } }
Endnote
%0 Conference Paper %T Generalized Group Testing %A Xiwei Cheng %A Sidharth Jaggi %A Qiaoqiao Zhou %B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2022 %E Gustau Camps-Valls %E Francisco J. R. Ruiz %E Isabel Valera %F pmlr-v151-cheng22a %I PMLR %P 10777--10835 %U https://proceedings.mlr.press/v151/cheng22a.html %V 151 %X In the problem of classical group testing one aims to identify a small subset (of size $d$) diseased individuals/defective items in a large population (of size $n$) via a minimal number of suitably-designed group tests on subsets of items, where the test outcome is positive iff the given test contains at least one defective item. Motivated by physical considerations, we consider a generalized setting that includes as special cases multiple other group-testing-like models in the literature. In our setting, which subsumes as special cases a variety of noiseless and noisy group-testing models in the literature, the test outcome is positive with probability $f(x)$, where $x$ is the number of defectives tested in a pool, and $f(\cdot)$ is an arbitrary {\it monotonically increasing} (stochastic) test function. Our main contributions are as follows. 1. We present a non-adaptive scheme that with probability $1-\varepsilon$ identifies all defective items. Our scheme requires at most ${\cal O}( H(f) d\log(n/\varepsilon))$ tests, where $H(f)$ is a suitably defined “sensitivity parameter" of $f(\cdot)$, and is never larger than ${\cal O}(d^{1+o(1)})$, but may be substantially smaller for many $f(\cdot)$. 2. We argue that any non-adaptive group testing scheme needs at least $\Omega (h(f) d\log(n/d))$ tests to ensure high reliability recovery. Here $h(f)$ is a suitably defined “concentration parameter" of $f(\cdot)$, and $h(f) \in \Omega{(1)}$. 3. We prove that our sample-complexity bounds for generalized group testing are information-theoretically near-optimal for a variety of sparse-recovery group-testing models in the literature. That is, for {\it any} “noisy" test function $f(\cdot)$ (i.e. $0< f(0) < f(d) <1$), and for a variety of “(one-sided) noiseless" test functions $f(\cdot)$ (i.e., either $f(0)=0$, or $f(d)=1$, or both) studied in the literature we show that $H(f)/h(f) \in \Theta(1)$. As a by-product we tightly characterize the heretofore open information-theoretic sample-complexity for the well-studied model of threshold group-testing. For general (near)-noiseless test functions $f(\cdot)$ we show that $H(f)/h(f) \in {\cal O}(d^{1+o(1)})$. We also demonstrate a “natural" test-function $f(\cdot)$ whose sample complexity scales “extremally" as $\Theta ( d^2\log(n))$, rather than $\Theta ( d\log(n))$ as in the case of classical group-testing. Some of our techniques may be of independent interest – in particular our achievability requires a delicate saddle-point approximation, and our impossibility proof relies on a novel bound relating the mutual information of pair of random variables with the mean and variance of a specific function, and we derive novel structural results about monotone functions.
APA
Cheng, X., Jaggi, S. & Zhou, Q.. (2022). Generalized Group Testing . Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:10777-10835 Available from https://proceedings.mlr.press/v151/cheng22a.html.

Related Material