Generalized Group Testing

Xiwei Cheng; Sidharth Jaggi; Qiaoqiao Zhou

Generalized Group Testing

Xiwei Cheng, Sidharth Jaggi, Qiaoqiao Zhou

Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:10777-10835, 2022.

Abstract

In the problem of classical group testing one aims to identify a small subset (of size

$d$ ) diseased individuals/defective items in a large population (of size

$n$ ) via a minimal number of suitably-designed group tests on subsets of items, where the test outcome is positive iff the given test contains at least one defective item. Motivated by physical considerations, we consider a generalized setting that includes as special cases multiple other group-testing-like models in the literature. In our setting, which subsumes as special cases a variety of noiseless and noisy group-testing models in the literature, the test outcome is positive with probability

$f(x)$ , where

$x$ is the number of defectives tested in a pool, and

$f(\cdot)$ is an arbitrary {\it monotonically increasing} (stochastic) test function. Our main contributions are as follows. 1. We present a non-adaptive scheme that with probability

$1-\varepsilon$ identifies all defective items. Our scheme requires at most

${\cal O}( H(f) d\log(n/\varepsilon))$ tests, where

$H(f)$ is a suitably defined “sensitivity parameter" of

$f(\cdot)$ , and is never larger than

${\cal O}(d^{1+o(1)})$ , but may be substantially smaller for many

$f(\cdot)$ . 2. We argue that any non-adaptive group testing scheme needs at least

$\Omega (h(f) d\log(n/d))$ tests to ensure high reliability recovery. Here

$h(f)$ is a suitably defined “concentration parameter" of

$f(\cdot)$ , and

$h(f) \in \Omega{(1)}$ . 3. We prove that our sample-complexity bounds for generalized group testing are information-theoretically near-optimal for a variety of sparse-recovery group-testing models in the literature. That is, for {\it any} “noisy" test function

$f(\cdot)$ (i.e.

$0< f(0) < f(d) <1$ ), and for a variety of “(one-sided) noiseless" test functions

$f(\cdot)$ (i.e., either

$f(0)=0$ , or

$f(d)=1$ , or both) studied in the literature we show that

$H(f)/h(f) \in \Theta(1)$ . As a by-product we tightly characterize the heretofore open information-theoretic sample-complexity for the well-studied model of threshold group-testing. For general (near)-noiseless test functions

$f(\cdot)$ we show that

$H(f)/h(f) \in {\cal O}(d^{1+o(1)})$ . We also demonstrate a “natural" test-function

$f(\cdot)$ whose sample complexity scales “extremally" as

$\Theta ( d^2\log(n))$ , rather than

$\Theta ( d\log(n))$ as in the case of classical group-testing. Some of our techniques may be of independent interest – in particular our achievability requires a delicate saddle-point approximation, and our impossibility proof relies on a novel bound relating the mutual information of pair of random variables with the mean and variance of a specific function, and we derive novel structural results about monotone functions.

Cite this Paper

BibTeX


@InProceedings{pmlr-v151-cheng22a,
  title = 	 { Generalized Group Testing },
  author =       {Cheng, Xiwei and Jaggi, Sidharth and Zhou, Qiaoqiao},
  booktitle = 	 {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {10777--10835},
  year = 	 {2022},
  editor = 	 {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume = 	 {151},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {28--30 Mar},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v151/cheng22a/cheng22a.pdf},
  url = 	 {https://proceedings.mlr.press/v151/cheng22a.html},
  abstract = 	 { In the problem of classical group testing one aims to identify a small subset (of size $d$) diseased individuals/defective items in a large population (of size $n$) via a minimal number of suitably-designed group tests on subsets of items, where the test outcome is positive iff the given test contains at least one defective item. Motivated by physical considerations, we consider a generalized setting that includes as special cases multiple other group-testing-like models in the literature. In our setting, which subsumes as special cases a variety of noiseless and noisy group-testing models in the literature, the test outcome is positive with probability $f(x)$, where $x$ is the number of defectives tested in a pool, and $f(\cdot)$ is an arbitrary {\it monotonically increasing} (stochastic) test function. Our main contributions are as follows. 1. We present a non-adaptive scheme that with probability $1-\varepsilon$ identifies all defective items. Our scheme requires at most ${\cal O}( H(f) d\log(n/\varepsilon))$ tests, where $H(f)$ is a suitably defined “sensitivity parameter" of $f(\cdot)$, and is never larger than ${\cal O}(d^{1+o(1)})$, but may be substantially smaller for many $f(\cdot)$. 2. We argue that any non-adaptive group testing scheme needs at least $\Omega (h(f) d\log(n/d))$ tests to ensure high reliability recovery. Here $h(f)$ is a suitably defined “concentration parameter" of $f(\cdot)$, and $h(f) \in \Omega{(1)}$. 3. We prove that our sample-complexity bounds for generalized group testing are information-theoretically near-optimal for a variety of sparse-recovery group-testing models in the literature. That is, for {\it any} “noisy" test function $f(\cdot)$ (i.e. $0< f(0) < f(d) <1$), and for a variety of “(one-sided) noiseless" test functions $f(\cdot)$ (i.e., either $f(0)=0$, or $f(d)=1$, or both) studied in the literature we show that $H(f)/h(f) \in \Theta(1)$. As a by-product we tightly characterize the heretofore open information-theoretic sample-complexity for the well-studied model of threshold group-testing. For general (near)-noiseless test functions $f(\cdot)$ we show that $H(f)/h(f) \in {\cal O}(d^{1+o(1)})$. We also demonstrate a “natural" test-function $f(\cdot)$ whose sample complexity scales “extremally" as $\Theta ( d^2\log(n))$, rather than $\Theta ( d\log(n))$ as in the case of classical group-testing. Some of our techniques may be of independent interest – in particular our achievability requires a delicate saddle-point approximation, and our impossibility proof relies on a novel bound relating the mutual information of pair of random variables with the mean and variance of a specific function, and we derive novel structural results about monotone functions. }
}

Endnote

%0 Conference Paper
%T  Generalized Group Testing 
%A Xiwei Cheng
%A Sidharth Jaggi
%A Qiaoqiao Zhou
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera	
%F pmlr-v151-cheng22a
%I PMLR
%P 10777--10835
%U https://proceedings.mlr.press/v151/cheng22a.html
%V 151
%X  In the problem of classical group testing one aims to identify a small subset (of size $d$) diseased individuals/defective items in a large population (of size $n$) via a minimal number of suitably-designed group tests on subsets of items, where the test outcome is positive iff the given test contains at least one defective item. Motivated by physical considerations, we consider a generalized setting that includes as special cases multiple other group-testing-like models in the literature. In our setting, which subsumes as special cases a variety of noiseless and noisy group-testing models in the literature, the test outcome is positive with probability $f(x)$, where $x$ is the number of defectives tested in a pool, and $f(\cdot)$ is an arbitrary {\it monotonically increasing} (stochastic) test function. Our main contributions are as follows. 1. We present a non-adaptive scheme that with probability $1-\varepsilon$ identifies all defective items. Our scheme requires at most ${\cal O}( H(f) d\log(n/\varepsilon))$ tests, where $H(f)$ is a suitably defined “sensitivity parameter" of $f(\cdot)$, and is never larger than ${\cal O}(d^{1+o(1)})$, but may be substantially smaller for many $f(\cdot)$. 2. We argue that any non-adaptive group testing scheme needs at least $\Omega (h(f) d\log(n/d))$ tests to ensure high reliability recovery. Here $h(f)$ is a suitably defined “concentration parameter" of $f(\cdot)$, and $h(f) \in \Omega{(1)}$. 3. We prove that our sample-complexity bounds for generalized group testing are information-theoretically near-optimal for a variety of sparse-recovery group-testing models in the literature. That is, for {\it any} “noisy" test function $f(\cdot)$ (i.e. $0< f(0) < f(d) <1$), and for a variety of “(one-sided) noiseless" test functions $f(\cdot)$ (i.e., either $f(0)=0$, or $f(d)=1$, or both) studied in the literature we show that $H(f)/h(f) \in \Theta(1)$. As a by-product we tightly characterize the heretofore open information-theoretic sample-complexity for the well-studied model of threshold group-testing. For general (near)-noiseless test functions $f(\cdot)$ we show that $H(f)/h(f) \in {\cal O}(d^{1+o(1)})$. We also demonstrate a “natural" test-function $f(\cdot)$ whose sample complexity scales “extremally" as $\Theta ( d^2\log(n))$, rather than $\Theta ( d\log(n))$ as in the case of classical group-testing. Some of our techniques may be of independent interest – in particular our achievability requires a delicate saddle-point approximation, and our impossibility proof relies on a novel bound relating the mutual information of pair of random variables with the mean and variance of a specific function, and we derive novel structural results about monotone functions.

APA


Cheng, X., Jaggi, S. & Zhou, Q.. (2022).  Generalized Group Testing . Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:10777-10835 Available from https://proceedings.mlr.press/v151/cheng22a.html.

Generalized Group Testing

Abstract

Cite this Paper

Related Material