Computational Bounds on Statistical Query Learning
Proceedings of the 25th Annual Conference on Learning Theory, PMLR 23:16.1-16.22, 2012.
Abstract
We study the complexity of learning in Kearns' well-known statistical query (SQ) learning model (Kearns, 1993). A number of previous works have addressed the definition and estimation of information-theoretic bounds on SQ learning complexity, in other words, bounds on the query complexity. Here we give the first strictly computational upper and lower bounds on the complexity of several types of learning in the SQ model. As has already been observed, the known characterization of distribution-specific SQ learning (Blum et al., 1994) implies that for weak learning over a fixed distribution, the query complexity and computational complexity are essentially the same. In contrast, we show that for both distribution-specific and distribution-independent (strong) learning there exists a concept class of polynomial query complexity that is not efficiently learnable unless RP = NP. We then prove that our distribution-specific lower bound is essentially tight by showing that for every concept class C of polynomial query complexity there exists a polynomial-time algorithm that, given access to random points from any distribution D and an NP oracle, can SQ learn C over D. We also consider a restriction of the SQ model, the correlational statistical query (CSQ) model of learning (Bshouty and Feldman, 2001; Feldman, 2008), which is closely related to Valiant's model of evolvability (Valiant, 2007). We show a similar separation result for distribution-independent CSQ learning under a stronger assumption: there exists a concept class of polynomial CSQ query complexity which is not efficiently learnable unless every problem in W[P] has a randomized fixed-parameter tractable algorithm.