Managing Uncertainty within the KTD Framework

Matthieu Geist; Olivier Pietquin

Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, PMLR 16:157-168, 2011.

Abstract

The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches to this problem use some uncertainty information about the values estimated during learning. On the other hand, scalability is a known weakness of RL algorithms, and value function approximation has become a major topic of research. Both problems arise in real-world applications; however, few approaches allow approximating the value function while maintaining uncertainty information about the estimates. Even fewer use this information to address the exploration/exploitation dilemma. In this paper, we show how such uncertainty information can be derived from a Kalman-based Temporal Differences (KTD) framework and how it can be used.

Cite this Paper

BibTeX


@InProceedings{pmlr-v16-geist11a,
  title = 	 {Managing Uncertainty within the KTD Framework},
  author = 	 {Geist, Matthieu and Pietquin, Olivier},
  booktitle = 	 {Active Learning and Experimental Design workshop In conjunction with AISTATS 2010},
  pages = 	 {157--168},
  year = 	 {2011},
  editor = 	 {Guyon, Isabelle and Cawley, Gavin and Dror, Gideon and Lemaire, Vincent and Statnikov, Alexander},
  volume = 	 {16},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Sardinia, Italy},
  month = 	 {16 May},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v16/geist11a/geist11a.pdf},
  url = 	 {https://proceedings.mlr.press/v16/geist11a.html},
  abstract = 	 {The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches in addressing this problem tend to use some uncertainty information about values estimated during learning. On another hand, scalability is known as being a lack of RL algorithms and value function approximation has become a major topic of research. Both problems arise in real-world applications, however few approaches allow approximating the value function while maintaining uncertainty information about estimates. Even fewer use this information in the purpose of addressing the exploration/exploitation dilemma. In this paper, we show how such an uncertainty information can be derived from a Kalman-based Temporal Differences (KTD) framework and how it can be used.}
}

Endnote

%0 Conference Paper
%T Managing Uncertainty within the KTD Framework
%A Matthieu Geist
%A Olivier Pietquin
%B Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
%C Proceedings of Machine Learning Research
%D 2011
%E Isabelle Guyon
%E Gavin Cawley
%E Gideon Dror
%E Vincent Lemaire
%E Alexander Statnikov
%F pmlr-v16-geist11a
%I PMLR
%P 157--168
%U https://proceedings.mlr.press/v16/geist11a.html
%V 16
%X The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches in addressing this problem tend to use some uncertainty information about values estimated during learning. On another hand, scalability is known as being a lack of RL algorithms and value function approximation has become a major topic of research. Both problems arise in real-world applications, however few approaches allow approximating the value function while maintaining uncertainty information about estimates. Even fewer use this information in the purpose of addressing the exploration/exploitation dilemma. In this paper, we show how such an uncertainty information can be derived from a Kalman-based Temporal Differences (KTD) framework and how it can be used.

RIS


TY  - CPAPER
TI  - Managing Uncertainty within the KTD Framework
AU  - Matthieu Geist
AU  - Olivier Pietquin
BT  - Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
DA  - 2011/04/21
ED  - Isabelle Guyon
ED  - Gavin Cawley
ED  - Gideon Dror
ED  - Vincent Lemaire
ED  - Alexander Statnikov
ID  - pmlr-v16-geist11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 16
SP  - 157
EP  - 168
L1  - http://proceedings.mlr.press/v16/geist11a/geist11a.pdf
UR  - https://proceedings.mlr.press/v16/geist11a.html
AB  - The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches in addressing this problem tend to use some uncertainty information about values estimated during learning. On another hand, scalability is known as being a lack of RL algorithms and value function approximation has become a major topic of research. Both problems arise in real-world applications, however few approaches allow approximating the value function while maintaining uncertainty information about estimates. Even fewer use this information in the purpose of addressing the exploration/exploitation dilemma. In this paper, we show how such an uncertainty information can be derived from a Kalman-based Temporal Differences (KTD) framework and how it can be used.
ER  -

APA


Geist, M. & Pietquin, O. (2011). Managing Uncertainty within the KTD Framework. Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, in Proceedings of Machine Learning Research 16:157-168. Available from https://proceedings.mlr.press/v16/geist11a.html.
