Faster Algorithms for Testing under Conditional Sampling
Proceedings of The 28th Conference on Learning Theory, PMLR 40:607-636, 2015.
There has been considerable recent interest in distribution-tests whose run-time and sample requirements are sublinear in the domain-size k. We study two of the most important tests under the conditional-sampling model where each query specifies a subset S of the domain, and the response is a sample drawn from S according to the underlying distribution. For identity testing, which asks whether the underlying distribution equals a specific given distribution or ε-differs from it, we reduce the known time and sample complexities from \widetilde\mathcalO(ε^-4) to \widetilde\mathcalO(ε^-2), thereby matching the information theoretic lower bound. For closeness testing, which asks whether two distributions underlying observed data sets are equal or different, we reduce existing complexity from \widetilde\mathcalO(ε^-4 \log^5 k) to an even sub-logarithmic \widetilde\mathcalO(ε^-5 \log \log k) thus providing a better bound to an open problem in Bertinoro Workshop on Sublinear Algorithms (Fisher, 2014).