[edit]

# Generalized Group Testing

*Proceedings of The 25th International Conference on Artificial Intelligence and Statistics*, PMLR 151:10777-10835, 2022.

#### Abstract

In the problem of classical group testing one aims to identify a small subset (of size $d$) diseased individuals/defective items in a large population (of size $n$) via a minimal number of suitably-designed group tests on subsets of items, where the test outcome is positive iff the given test contains at least one defective item. Motivated by physical considerations, we consider a generalized setting that includes as special cases multiple other group-testing-like models in the literature. In our setting, which subsumes as special cases a variety of noiseless and noisy group-testing models in the literature, the test outcome is positive with probability $f(x)$, where $x$ is the number of defectives tested in a pool, and $f(\cdot)$ is an arbitrary {\it monotonically increasing} (stochastic) test function. Our main contributions are as follows. 1. We present a non-adaptive scheme that with probability $1-\varepsilon$ identifies all defective items. Our scheme requires at most ${\cal O}( H(f) d\log(n/\varepsilon))$ tests, where $H(f)$ is a suitably defined “sensitivity parameter" of $f(\cdot)$, and is never larger than ${\cal O}(d^{1+o(1)})$, but may be substantially smaller for many $f(\cdot)$. 2. We argue that any non-adaptive group testing scheme needs at least $\Omega (h(f) d\log(n/d))$ tests to ensure high reliability recovery. Here $h(f)$ is a suitably defined “concentration parameter" of $f(\cdot)$, and $h(f) \in \Omega{(1)}$. 3. We prove that our sample-complexity bounds for generalized group testing are information-theoretically near-optimal for a variety of sparse-recovery group-testing models in the literature. That is, for {\it any} “noisy" test function $f(\cdot)$ (i.e. $0< f(0) < f(d) <1$), and for a variety of “(one-sided) noiseless" test functions $f(\cdot)$ (i.e., either $f(0)=0$, or $f(d)=1$, or both) studied in the literature we show that $H(f)/h(f) \in \Theta(1)$. As a by-product we tightly characterize the heretofore open information-theoretic sample-complexity for the well-studied model of threshold group-testing. For general (near)-noiseless test functions $f(\cdot)$ we show that $H(f)/h(f) \in {\cal O}(d^{1+o(1)})$. We also demonstrate a “natural" test-function $f(\cdot)$ whose sample complexity scales “extremally" as $\Theta ( d^2\log(n))$, rather than $\Theta ( d\log(n))$ as in the case of classical group-testing. Some of our techniques may be of independent interest – in particular our achievability requires a delicate saddle-point approximation, and our impossibility proof relies on a novel bound relating the mutual information of pair of random variables with the mean and variance of a specific function, and we derive novel structural results about monotone functions.