Computational Bounds on Statistical Query Learning
Proceedings of the 25th Annual Conference on Learning Theory, PMLR 23:16.1-16.22, 2012.
Abstract
We study the complexity of learning in Kearns' well-known statistical query (SQ) learning model (Kearns, 1993). A number of previous works have addressed the definition and estimation of information-theoretic bounds on SQ learning complexity, in other words, bounds on the query complexity. Here we give the first strictly computational upper and lower bounds on the complexity of several types of learning in the SQ model. As has already been observed, the known characterization of distribution-specific SQ learning (Blum et al., 1994) implies that for weak learning over a fixed distribution, the query complexity and computational complexity are essentially the same. In contrast, we show that for both distribution-specific and distribution-independent (strong) learning there exists a concept class of polynomial query complexity that is not efficiently learnable unless RP = NP. We then prove that our distribution-specific lower bound is essentially tight by showing that for every concept class C of polynomial query complexity there exists a polynomial-time algorithm that, given access to random points from any distribution D and an NP oracle, can SQ learn C over D. We also consider a restriction of the SQ model, the correlational statistical query (CSQ) model of learning (Bshouty and Feldman, 2001; Feldman, 2008), which is closely related to Valiant's model of evolvability (Valiant, 2007). We show a similar separation result for distribution-independent CSQ learning under a stronger assumption: there exists a concept class of polynomial CSQ query complexity which is not efficiently learnable unless every problem in W[P] has a randomized fixed-parameter tractable algorithm.