Agnostic KWIK learning and efficient approximate reinforcement learning

István Szita; Csaba Szepesvári

Agnostic KWIK learning and efficient approximate reinforcement learning

István Szita, Csaba Szepesvári

Proceedings of the 24th Annual Conference on Learning Theory, PMLR 19:739-772, 2011.

Abstract

A popular approach in reinforcement learning is to use a model-based algorithm, i.e., an algorithm that utilizes a model learner to learn an approximate model to the environment. It has been shown that such a model-based learner is efficient if the model learner is efficient in the so-called “knows what it knows” (KWIK) framework. A major limitation of the standard KWIK framework is that, by its very definition, it covers only the case when the (model) learner can represent the actual environment with no errors. In this paper, we study the agnostic KWIK learning model, where we relax this assumption by allowing nonzero approximation errors. We show that with the new definition an efficient model learner still leads to an efficient reinforcement learning algorithm. At the same time, though, we find that learning within the new framework can be substantially slower as compared to the standard framework, even in the case of simple learning problems.

Cite this Paper

BibTeX


@InProceedings{pmlr-v19-szita11a,
  title = 	 {Agnostic KWIK learning and efficient approximate reinforcement learning},
  author = 	 {Szita, István and Szepesvári, Csaba},
  booktitle = 	 {Proceedings of the 24th Annual Conference on Learning Theory},
  pages = 	 {739--772},
  year = 	 {2011},
  editor = 	 {Kakade, Sham M. and von Luxburg, Ulrike},
  volume = 	 {19},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Budapest, Hungary},
  month = 	 {09--11 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v19/szita11a/szita11a.pdf},
  url = 	 {https://proceedings.mlr.press/v19/szita11a.html},
  abstract = 	 {A popular approach in reinforcement learning is to use a model-based algorithm, i.e., an algorithm that utilizes a model learner to learn an approximate model to the  environment. It has been shown that  such a model-based learner is efficient  if the model learner is efficient  in the so-called “knows what it knows”  (KWIK)  framework. A major limitation of the standard KWIK framework is that, by its very definition, it covers only the case when the (model) learner can represent  the actual environment with no errors. In this paper, we study the agnostic KWIK learning model, where we relax this assumption by allowing nonzero approximation errors. We show that with the new definition  an efficient model learner still leads to an efficient reinforcement learning algorithm. At the same time, though, we find that learning within the new framework can be substantially slower as compared to the standard framework, even in the case of simple learning problems.}
}

Endnote

%0 Conference Paper
%T Agnostic KWIK learning and efficient approximate reinforcement learning
%A István Szita
%A Csaba Szepesvári
%B Proceedings of the 24th Annual Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2011
%E Sham M. Kakade
%E Ulrike von Luxburg	
%F pmlr-v19-szita11a
%I PMLR
%P 739--772
%U https://proceedings.mlr.press/v19/szita11a.html
%V 19
%X A popular approach in reinforcement learning is to use a model-based algorithm, i.e., an algorithm that utilizes a model learner to learn an approximate model to the  environment. It has been shown that  such a model-based learner is efficient  if the model learner is efficient  in the so-called “knows what it knows”  (KWIK)  framework. A major limitation of the standard KWIK framework is that, by its very definition, it covers only the case when the (model) learner can represent  the actual environment with no errors. In this paper, we study the agnostic KWIK learning model, where we relax this assumption by allowing nonzero approximation errors. We show that with the new definition  an efficient model learner still leads to an efficient reinforcement learning algorithm. At the same time, though, we find that learning within the new framework can be substantially slower as compared to the standard framework, even in the case of simple learning problems.

RIS


TY  - CPAPER
TI  - Agnostic KWIK learning and efficient approximate reinforcement learning
AU  - István Szita
AU  - Csaba Szepesvári
BT  - Proceedings of the 24th Annual Conference on Learning Theory
DA  - 2011/12/21
ED  - Sham M. Kakade
ED  - Ulrike von Luxburg	
ID  - pmlr-v19-szita11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 19
SP  - 739
EP  - 772
L1  - http://proceedings.mlr.press/v19/szita11a/szita11a.pdf
UR  - https://proceedings.mlr.press/v19/szita11a.html
AB  - A popular approach in reinforcement learning is to use a model-based algorithm, i.e., an algorithm that utilizes a model learner to learn an approximate model to the  environment. It has been shown that  such a model-based learner is efficient  if the model learner is efficient  in the so-called “knows what it knows”  (KWIK)  framework. A major limitation of the standard KWIK framework is that, by its very definition, it covers only the case when the (model) learner can represent  the actual environment with no errors. In this paper, we study the agnostic KWIK learning model, where we relax this assumption by allowing nonzero approximation errors. We show that with the new definition  an efficient model learner still leads to an efficient reinforcement learning algorithm. At the same time, though, we find that learning within the new framework can be substantially slower as compared to the standard framework, even in the case of simple learning problems.
ER  -

APA


Szita, I. & Szepesvári, C.. (2011). Agnostic KWIK learning and efficient approximate reinforcement learning. Proceedings of the 24th Annual Conference on Learning Theory, in Proceedings of Machine Learning Research 19:739-772 Available from https://proceedings.mlr.press/v19/szita11a.html.

Related Material

Download PDF