Retrieval-Augmented Decision Transformer: External Memory for In-context RL

Thomas Schmied, Fabian Paischer, Vihang Prakash Patil, Markus Hofmarcher, Razvan Pascanu, Sepp Hochreiter
Proceedings of The 4th Conference on Lifelong Learning Agents, PMLR 330:376-417, 2026.

Abstract

In-context learning (ICL) is the ability of a model to learn a new task by observing a few exemplars within its context. While prevalent in NLP, this capability has recently also been observed in Reinforcement Learning (RL) settings. Prior in-context RL methods, however, require entire episodes in the agent’s context. Given that complex environments typically lead to long episodes with sparse rewards, these methods are constrained to environments with short episodes. To address these challenges, we introduce Retrieval-Augmented Decision Transformer (RA-DT). RA-DT employs an external memory mechanism to store past experiences, from which it retrieves only the sub-trajectories relevant to the current situation. The retrieval component in RA-DT can be entirely domain-agnostic. We evaluate the capabilities of RA-DT on grid-world environments, robotics simulations, and procedurally-generated video games. On grid-worlds, RA-DT outperforms baselines while using only a fraction of their context length. Furthermore, we illuminate the limitations of current in-context RL methods on complex environments and discuss future directions. To facilitate future research, we release datasets for four of the considered environments.
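
To make the external-memory mechanism concrete, below is a minimal Python sketch of the retrieval idea described in the abstract: past sub-trajectories are stored in a vector memory and the most similar ones are fetched to condition the agent. This is not the authors' implementation; the embedding function, cosine-similarity search, memory layout, and top-k value are illustrative assumptions, and the step that conditions the Decision Transformer on the retrieved sub-trajectories is omitted.

# Minimal sketch of the external-memory retrieval idea (illustrative only).
# Assumptions: sub-trajectories are keyed by a fixed-size embedding from some
# (possibly domain-agnostic) encoder, and retrieval is top-k cosine similarity.
import numpy as np


class TrajectoryMemory:
    """Stores sub-trajectories with vector keys and retrieves nearest neighbours."""

    def __init__(self):
        self.keys = []      # one embedding per stored sub-trajectory
        self.values = []    # the raw sub-trajectory (states, actions, rewards)

    def add(self, key: np.ndarray, sub_trajectory) -> None:
        # Normalise keys so the dot product below is cosine similarity.
        self.keys.append(key / (np.linalg.norm(key) + 1e-8))
        self.values.append(sub_trajectory)

    def retrieve(self, query: np.ndarray, k: int = 4):
        """Return the k stored sub-trajectories most similar to the query embedding."""
        if not self.keys:
            return []
        query = query / (np.linalg.norm(query) + 1e-8)
        sims = np.stack(self.keys) @ query      # cosine similarity to every key
        top = np.argsort(-sims)[:k]             # indices of the k best matches
        return [self.values[i] for i in top]


# Hypothetical usage: embed the current context window, fetch relevant past
# experience, then condition the policy/Decision Transformer on it (not shown).
memory = TrajectoryMemory()
memory.add(np.random.randn(32), {"states": "...", "actions": "...", "rewards": "..."})
retrieved = memory.retrieve(query=np.random.randn(32), k=2)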

Cite this Paper


BibTeX
@InProceedings{pmlr-v330-schmied26a,
  title     = {Retrieval-Augmented Decision Transformer: External Memory for In-context RL},
  author    = {Schmied, Thomas and Paischer, Fabian and Patil, Vihang Prakash and Hofmarcher, Markus and Pascanu, Razvan and Hochreiter, Sepp},
  booktitle = {Proceedings of The 4th Conference on Lifelong Learning Agents},
  pages     = {376--417},
  year      = {2026},
  editor    = {Chandar, Sarath and Pascanu, Razvan and Eaton, Eric and Liu, Bing and Mahmood, Rupam and Rannen-Triki, Amal},
  volume    = {330},
  series    = {Proceedings of Machine Learning Research},
  month     = {11--14 Aug},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v330/main/assets/schmied26a/schmied26a.pdf},
  url       = {https://proceedings.mlr.press/v330/schmied26a.html},
  abstract  = {In-context learning (ICL) is the ability of a model to learn a new task by observing a few exemplars within its context. While prevalent in NLP, this capability has recently also been observed in Reinforcement Learning (RL) settings. Prior in-context RL methods, however, require entire episodes in the agent’s context. Given that complex environments typically lead to long episodes with sparse rewards, these methods are constrained to environments with short episodes. To address these challenges, we introduce Retrieval-Augmented Decision Transformer (RA-DT). RA-DT employs an external memory mechanism to store past experiences from which it retrieves only sub-trajectories relevant for the current situation. The retrieval component in RA-DT can be entirely domain-agnostic. We evaluate the capabilities of RA-DT on grid-world environments, robotics simulations, and procedurally-generated video games. On grid-worlds, RA-DT outperforms baselines while using only a fraction of their context length. Furthermore, we illuminate the limitations of current in-context RL methods on complex environments and discuss future directions. To facilitate future research, we release datasets for four of the considered environments}
}
Endnote
%0 Conference Paper
%T Retrieval-Augmented Decision Transformer: External Memory for In-context RL
%A Thomas Schmied
%A Fabian Paischer
%A Vihang Prakash Patil
%A Markus Hofmarcher
%A Razvan Pascanu
%A Sepp Hochreiter
%B Proceedings of The 4th Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2026
%E Sarath Chandar
%E Razvan Pascanu
%E Eric Eaton
%E Bing Liu
%E Rupam Mahmood
%E Amal Rannen-Triki
%F pmlr-v330-schmied26a
%I PMLR
%P 376--417
%U https://proceedings.mlr.press/v330/schmied26a.html
%V 330
%X In-context learning (ICL) is the ability of a model to learn a new task by observing a few exemplars within its context. While prevalent in NLP, this capability has recently also been observed in Reinforcement Learning (RL) settings. Prior in-context RL methods, however, require entire episodes in the agent’s context. Given that complex environments typically lead to long episodes with sparse rewards, these methods are constrained to environments with short episodes. To address these challenges, we introduce Retrieval-Augmented Decision Transformer (RA-DT). RA-DT employs an external memory mechanism to store past experiences from which it retrieves only sub-trajectories relevant for the current situation. The retrieval component in RA-DT can be entirely domain-agnostic. We evaluate the capabilities of RA-DT on grid-world environments, robotics simulations, and procedurally-generated video games. On grid-worlds, RA-DT outperforms baselines while using only a fraction of their context length. Furthermore, we illuminate the limitations of current in-context RL methods on complex environments and discuss future directions. To facilitate future research, we release datasets for four of the considered environments
APA
Schmied, T., Paischer, F., Patil, V.P., Hofmarcher, M., Pascanu, R. & Hochreiter, S. (2026). Retrieval-Augmented Decision Transformer: External Memory for In-context RL. Proceedings of The 4th Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 330:376-417. Available from https://proceedings.mlr.press/v330/schmied26a.html.
