Feature Reinforcement Learning using Looping Suffix Trees

Mayank Daswani; Peter Sunehag; Marcus Hutter

Proceedings of the Tenth European Workshop on Reinforcement Learning, PMLR 24:11-24, 2013.

Abstract

There has recently been much interest in history-based methods using suffix trees to solve POMDPs. However, these suffix trees cannot efficiently represent environments that have long-term dependencies. We extend the recently introduced CTΦMDP algorithm to the space of looping suffix trees, which have previously only been used in solving deterministic POMDPs. The resulting algorithm replicates results from CTΦMDP for environments with short-term dependencies, while it outperforms LSTM-based methods on TMaze, a deep memory environment.

Cite this Paper

BibTeX


@InProceedings{pmlr-v24-daswani12a,
  title = 	 {Feature Reinforcement Learning using Looping Suffix Trees},
  author = 	 {Daswani, Mayank and Sunehag, Peter and Hutter, Marcus},
  booktitle = 	 {Proceedings of the Tenth European Workshop on Reinforcement Learning},
  pages = 	 {11--24},
  year = 	 {2013},
  editor = 	 {Deisenroth, Marc Peter and Szepesvári, Csaba and Peters, Jan},
  volume = 	 {24},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Edinburgh, Scotland},
  month = 	 {30 Jun--01 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v24/daswani12a/daswani12a.pdf},
  url = 	 {https://proceedings.mlr.press/v24/daswani12a.html},
  abstract = 	 {There has recently been much interest in history-based methods using suffix trees to solve POMDPs. However, these suffix trees cannot efficiently represent environments that have long-term dependencies. We extend the recently introduced CTΦMDP algorithm to the space of looping suffix trees, which have previously only been used in solving deterministic POMDPs. The resulting algorithm replicates results from CTΦMDP for environments with short-term dependencies, while it outperforms LSTM-based methods on TMaze, a deep memory environment.}
}

Endnote

%0 Conference Paper
%T Feature Reinforcement Learning using Looping Suffix Trees
%A Mayank Daswani
%A Peter Sunehag
%A Marcus Hutter
%B Proceedings of the Tenth European Workshop on Reinforcement Learning
%C Proceedings of Machine Learning Research
%D 2013
%E Marc Peter Deisenroth
%E Csaba Szepesvári
%E Jan Peters
%F pmlr-v24-daswani12a
%I PMLR
%P 11--24
%U https://proceedings.mlr.press/v24/daswani12a.html
%V 24
%X There has recently been much interest in history-based methods using suffix trees to solve POMDPs. However, these suffix trees cannot efficiently represent environments that have long-term dependencies. We extend the recently introduced CTΦMDP algorithm to the space of looping suffix trees, which have previously only been used in solving deterministic POMDPs. The resulting algorithm replicates results from CTΦMDP for environments with short-term dependencies, while it outperforms LSTM-based methods on TMaze, a deep memory environment.

RIS


TY  - CPAPER
TI  - Feature Reinforcement Learning using Looping Suffix Trees
AU  - Mayank Daswani
AU  - Peter Sunehag
AU  - Marcus Hutter
BT  - Proceedings of the Tenth European Workshop on Reinforcement Learning
DA  - 2013/01/12
ED  - Marc Peter Deisenroth
ED  - Csaba Szepesvári
ED  - Jan Peters
ID  - pmlr-v24-daswani12a
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 24
SP  - 11
EP  - 24
L1  - http://proceedings.mlr.press/v24/daswani12a/daswani12a.pdf
UR  - https://proceedings.mlr.press/v24/daswani12a.html
AB  - There has recently been much interest in history-based methods using suffix trees to solve POMDPs. However, these suffix trees cannot efficiently represent environments that have long-term dependencies. We extend the recently introduced CTΦMDP algorithm to the space of looping suffix trees, which have previously only been used in solving deterministic POMDPs. The resulting algorithm replicates results from CTΦMDP for environments with short-term dependencies, while it outperforms LSTM-based methods on TMaze, a deep memory environment.
ER  -

APA


Daswani, M., Sunehag, P. & Hutter, M. (2013). Feature Reinforcement Learning using Looping Suffix Trees. Proceedings of the Tenth European Workshop on Reinforcement Learning, in Proceedings of Machine Learning Research 24:11-24. Available from https://proceedings.mlr.press/v24/daswani12a.html.
