Think Before You Act: Decision Transformers with Working Memory

Jikun Kang, Romain Laroche, Xingdi Yuan, Adam Trischler, Xue Liu, Jie Fu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:23001-23021, 2024.

Abstract

Decision Transformer-based decision-making agents have shown the ability to generalize across multiple tasks. However, their performance relies on massive amounts of data and computation. We argue that this inefficiency stems from the forgetting phenomenon, in which a model memorizes its behaviors in its parameters throughout training. As a result, training on a new task may degrade the model’s performance on previous tasks. In contrast to LLMs’ implicit memory mechanism, the human brain utilizes distributed memory storage, which helps manage and organize multiple skills efficiently, mitigating the forgetting phenomenon. Inspired by this, we propose a working memory module to store, blend, and retrieve information for different downstream tasks. Evaluation results show that the proposed method improves training efficiency and generalization in Atari games and Meta-World object manipulation tasks. Moreover, we demonstrate that memory fine-tuning further enhances the adaptability of the proposed architecture.
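To make the store, blend, and retrieve operations concrete, below is a minimal sketch of what such a working memory module could look like: a small bank of memory slots addressed by attention, with a gate that blends new information into existing slots. This is an illustrative assumption, not the paper's implementation; the class name WorkingMemory, the slot count, and the gated write rule are placeholder choices.

import torch
import torch.nn as nn
import torch.nn.functional as F

class WorkingMemory(nn.Module):
    """Hypothetical sketch: attention-addressed memory slots with a gated write."""

    def __init__(self, num_slots: int = 16, dim: int = 128):
        super().__init__()
        # Memory is per-episode state rather than a learned weight, so use a buffer.
        self.register_buffer("memory", torch.zeros(num_slots, dim))
        self.read_query = nn.Linear(dim, dim)      # addresses slots for retrieval
        self.write_key = nn.Linear(dim, dim)       # addresses slots for storage
        self.blend_gate = nn.Linear(2 * dim, dim)  # how much of each slot to keep
        self.scale = dim ** 0.5

    def retrieve(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, dim) hidden states; returns a (batch, dim) memory readout.
        attn = F.softmax(self.read_query(h) @ self.memory.t() / self.scale, dim=-1)
        return attn @ self.memory

    def store(self, h: torch.Tensor) -> None:
        # Blend new information into the slots it most strongly addresses.
        attn = F.softmax(self.write_key(h) @ self.memory.t() / self.scale, dim=-1)
        update = attn.t() @ h  # (num_slots, dim): batch-weighted write content
        gate = torch.sigmoid(self.blend_gate(torch.cat([self.memory, update], dim=-1)))
        self.memory = gate * self.memory + (1 - gate) * update

# Example usage: consult memory before acting, then write back.
mem = WorkingMemory(num_slots=16, dim=128)
h = torch.randn(4, 128)     # e.g., transformer hidden states for a batch of steps
readout = mem.retrieve(h)   # "think before you act": read task-relevant context
mem.store(h)                # update the memory after the step

In a Decision Transformer, such a readout could be fused with the hidden state before the action head. Keeping task knowledge in explicit slots, rather than only in the network's weights, is what the abstract credits with mitigating forgetting across tasks.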

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-kang24b,
  title     = {Think Before You Act: Decision Transformers with Working Memory},
  author    = {Kang, Jikun and Laroche, Romain and Yuan, Xingdi and Trischler, Adam and Liu, Xue and Fu, Jie},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {23001--23021},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/kang24b/kang24b.pdf},
  url       = {https://proceedings.mlr.press/v235/kang24b.html}
}
Endnote
%0 Conference Paper
%T Think Before You Act: Decision Transformers with Working Memory
%A Jikun Kang
%A Romain Laroche
%A Xingdi Yuan
%A Adam Trischler
%A Xue Liu
%A Jie Fu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-kang24b
%I PMLR
%P 23001--23021
%U https://proceedings.mlr.press/v235/kang24b.html
%V 235
APA
Kang, J., Laroche, R., Yuan, X., Trischler, A., Liu, X. & Fu, J. (2024). Think Before You Act: Decision Transformers with Working Memory. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:23001-23021. Available from https://proceedings.mlr.press/v235/kang24b.html.
