Managing Uncertainty within the KTD Framework

Matthieu Geist, Olivier Pietquin
Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, PMLR 16:157-168, 2011.

Abstract

The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches to this problem rely on some uncertainty information about the values estimated during learning. On the other hand, scalability is a known weakness of RL algorithms, and value function approximation has become a major topic of research. Both problems arise in real-world applications, yet few approaches can approximate the value function while maintaining uncertainty information about the estimates, and even fewer use this information to address the exploration/exploitation dilemma. In this paper, we show how such uncertainty information can be derived from the Kalman-based Temporal Differences (KTD) framework and how it can be used.
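To make the abstract's central idea concrete, here is a minimal sketch (not the authors' code) of how a Kalman-style posterior over value-function parameters yields usable uncertainty estimates. It assumes a linear parameterization V(s) = θᵀφ(s), whereas KTD handles nonlinear parameterizations via the unscented transform; the function and variable names (`value_and_uncertainty`, `theta_hat`, `P`, `kappa`) and the toy numbers are all hypothetical.

```python
import numpy as np

# Sketch only: with a linear value function V(s) = theta^T phi(s) and a
# Kalman-filter posterior over the parameters, theta ~ N(theta_hat, P),
# the parameter covariance P induces a value uncertainty
#   sigma^2(s) = phi(s)^T P phi(s).
# KTD itself tracks theta_hat and P online from observed transitions.

def value_and_uncertainty(theta_hat, P, phi_s):
    """Return the value estimate and its standard deviation at one state."""
    v = theta_hat @ phi_s      # point estimate of V(s)
    var = phi_s @ P @ phi_s    # variance propagated from the parameter covariance
    return v, np.sqrt(var)

# Hypothetical toy posterior: 3 features, diagonal parameter covariance.
theta_hat = np.array([0.5, -0.2, 1.0])
P = np.diag([0.30, 0.05, 0.10])    # parameter covariance from a KTD-like filter
phi_s = np.array([1.0, 0.0, 0.5])  # feature vector of one state

v, sigma = value_and_uncertainty(theta_hat, P, phi_s)

# One natural use of such uncertainty is an optimistic exploration bonus,
# ranking actions by Q_hat(s, a) + kappa * sigma(s, a).
kappa = 1.0
print(f"V(s) = {v:.3f}, sigma(s) = {sigma:.3f}, optimistic = {v + kappa * sigma:.3f}")
```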

Cite this Paper


BibTeX
@InProceedings{pmlr-v16-geist11a,
  title     = {Managing Uncertainty within the KTD Framework},
  author    = {Geist, Matthieu and Pietquin, Olivier},
  booktitle = {Active Learning and Experimental Design workshop In conjunction with AISTATS 2010},
  pages     = {157--168},
  year      = {2011},
  editor    = {Guyon, Isabelle and Cawley, Gavin and Dror, Gideon and Lemaire, Vincent and Statnikov, Alexander},
  volume    = {16},
  series    = {Proceedings of Machine Learning Research},
  address   = {Sardinia, Italy},
  month     = {16 May},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v16/geist11a/geist11a.pdf},
  url       = {https://proceedings.mlr.press/v16/geist11a.html},
  abstract  = {The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches to this problem rely on some uncertainty information about the values estimated during learning. On the other hand, scalability is a known weakness of RL algorithms, and value function approximation has become a major topic of research. Both problems arise in real-world applications, yet few approaches can approximate the value function while maintaining uncertainty information about the estimates, and even fewer use this information to address the exploration/exploitation dilemma. In this paper, we show how such uncertainty information can be derived from the Kalman-based Temporal Differences (KTD) framework and how it can be used.}
}
Endnote
%0 Conference Paper
%T Managing Uncertainty within the KTD Framework
%A Matthieu Geist
%A Olivier Pietquin
%B Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
%C Proceedings of Machine Learning Research
%D 2011
%E Isabelle Guyon
%E Gavin Cawley
%E Gideon Dror
%E Vincent Lemaire
%E Alexander Statnikov
%F pmlr-v16-geist11a
%I PMLR
%P 157--168
%U https://proceedings.mlr.press/v16/geist11a.html
%V 16
%X The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches to this problem rely on some uncertainty information about the values estimated during learning. On the other hand, scalability is a known weakness of RL algorithms, and value function approximation has become a major topic of research. Both problems arise in real-world applications, yet few approaches can approximate the value function while maintaining uncertainty information about the estimates, and even fewer use this information to address the exploration/exploitation dilemma. In this paper, we show how such uncertainty information can be derived from the Kalman-based Temporal Differences (KTD) framework and how it can be used.
RIS
TY  - CPAPER
TI  - Managing Uncertainty within the KTD Framework
AU  - Matthieu Geist
AU  - Olivier Pietquin
BT  - Active Learning and Experimental Design workshop In conjunction with AISTATS 2010
DA  - 2011/04/21
ED  - Isabelle Guyon
ED  - Gavin Cawley
ED  - Gideon Dror
ED  - Vincent Lemaire
ED  - Alexander Statnikov
ID  - pmlr-v16-geist11a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 16
SP  - 157
EP  - 168
L1  - http://proceedings.mlr.press/v16/geist11a/geist11a.pdf
UR  - https://proceedings.mlr.press/v16/geist11a.html
AB  - The dilemma between exploration and exploitation is an important topic in reinforcement learning (RL). Most successful approaches to this problem rely on some uncertainty information about the values estimated during learning. On the other hand, scalability is a known weakness of RL algorithms, and value function approximation has become a major topic of research. Both problems arise in real-world applications, yet few approaches can approximate the value function while maintaining uncertainty information about the estimates, and even fewer use this information to address the exploration/exploitation dilemma. In this paper, we show how such uncertainty information can be derived from the Kalman-based Temporal Differences (KTD) framework and how it can be used.
ER  -
APA
Geist, M. & Pietquin, O. (2011). Managing Uncertainty within the KTD Framework. Active Learning and Experimental Design workshop In conjunction with AISTATS 2010, in Proceedings of Machine Learning Research 16:157-168. Available from https://proceedings.mlr.press/v16/geist11a.html.