Efficient World Models with Context-Aware Tokenization

Vincent Micheli, Eloi Alonso, François Fleuret
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:35623-35638, 2024.

Abstract

Scaling up deep Reinforcement Learning (RL) methods presents a significant challenge. Following developments in generative modelling, model-based RL positions itself as a strong contender. Recent advances in sequence modelling have led to effective transformer-based world models, albeit at the price of heavy computations due to the long sequences of tokens required to accurately simulate environments. In this work, we propose $\Delta$-IRIS, a new agent with a world model architecture composed of a discrete autoencoder that encodes stochastic deltas between time steps and an autoregressive transformer that predicts future deltas by summarizing the current state of the world with continuous tokens. In the Crafter benchmark, $\Delta$-IRIS sets a new state of the art at multiple frame budgets, while being an order of magnitude faster to train than previous attention-based approaches. We release our code and models at https://github.com/vmicheli/delta-iris.
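To make the two components described in the abstract concrete, here is a minimal, hypothetical PyTorch sketch. It is not the released Δ-IRIS code: all module names, shapes, hyperparameters, and the token layout are illustrative assumptions, and the paper's actual interleaving of frame, action, and delta tokens differs. The sketch shows (i) a discrete autoencoder that tokenizes only the change between consecutive frames, and (ii) an autoregressive transformer that predicts the next delta tokens from continuous per-frame summaries.

```python
import torch
import torch.nn as nn


class DeltaTokenizer(nn.Module):
    """Discrete autoencoder: encodes the change from frame_t to frame_{t+1} as a few tokens."""

    def __init__(self, num_codes=512, code_dim=64, num_tokens=4):
        super().__init__()
        self.num_tokens = num_tokens
        self.encoder = nn.Sequential(
            nn.Conv2d(6, 32, 4, stride=4), nn.ReLU(),      # both frames stacked on channels
            nn.Conv2d(32, 64, 4, stride=4), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, num_tokens * code_dim),
        )
        self.codebook = nn.Embedding(num_codes, code_dim)

    def forward(self, prev_frame, next_frame):
        # Conditioning on the previous frame means only the *delta* has to be encoded.
        z = self.encoder(torch.cat([prev_frame, next_frame], dim=1))
        z = z.view(z.shape[0], self.num_tokens, -1)                      # (B, K, D)
        codes = self.codebook.weight.unsqueeze(0).expand(z.shape[0], -1, -1)
        return torch.cdist(z, codes).argmin(dim=-1)                      # (B, K) token ids


class DeltaDynamics(nn.Module):
    """Autoregressive transformer over continuous frame summaries, actions, and delta tokens."""

    def __init__(self, num_codes=512, d_model=128, num_actions=17):
        super().__init__()
        self.frame_embed = nn.Sequential(                                # one continuous token per frame
            nn.Conv2d(3, 32, 8, stride=8), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, d_model),
        )
        self.action_embed = nn.Embedding(num_actions, d_model)
        self.token_embed = nn.Embedding(num_codes, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.transformer = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, num_codes)

    def forward(self, frames, actions, delta_tokens):
        # frames: (B, T, 3, H, W), actions: (B, T), delta_tokens: (B, T * K)
        B, T = actions.shape
        frame_tok = self.frame_embed(frames.flatten(0, 1)).view(B, T, -1)
        seq = torch.cat([frame_tok + self.action_embed(actions),
                         self.token_embed(delta_tokens)], dim=1)
        mask = nn.Transformer.generate_square_subsequent_mask(seq.shape[1]).to(seq.device)
        return self.head(self.transformer(seq, mask=mask))              # logits over delta tokens


if __name__ == "__main__":
    B, T, K = 2, 3, 4
    frames = torch.rand(B, T + 1, 3, 64, 64)
    actions = torch.randint(0, 17, (B, T))
    tokenizer, dynamics = DeltaTokenizer(num_tokens=K), DeltaDynamics()
    deltas = torch.stack([tokenizer(frames[:, t], frames[:, t + 1]) for t in range(T)], dim=1)
    logits = dynamics(frames[:, :T], actions, deltas.flatten(1))
    print(logits.shape)  # (B, T + T*K, num_codes)
```

The efficiency argument in the abstract follows from this split: because consecutive frames differ little, only a handful of delta tokens per step need to be predicted autoregressively, while the full frame context enters as cheap continuous summaries rather than long sequences of discrete tokens.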

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-micheli24a,
  title     = {Efficient World Models with Context-Aware Tokenization},
  author    = {Micheli, Vincent and Alonso, Eloi and Fleuret, Fran\c{c}ois},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {35623--35638},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/micheli24a/micheli24a.pdf},
  url       = {https://proceedings.mlr.press/v235/micheli24a.html},
  abstract  = {Scaling up deep Reinforcement Learning (RL) methods presents a significant challenge. Following developments in generative modelling, model-based RL positions itself as a strong contender. Recent advances in sequence modelling have led to effective transformer-based world models, albeit at the price of heavy computations due to the long sequences of tokens required to accurately simulate environments. In this work, we propose $\Delta$-IRIS, a new agent with a world model architecture composed of a discrete autoencoder that encodes stochastic deltas between time steps and an autoregressive transformer that predicts future deltas by summarizing the current state of the world with continuous tokens. In the Crafter benchmark, $\Delta$-IRIS sets a new state of the art at multiple frame budgets, while being an order of magnitude faster to train than previous attention-based approaches. We release our code and models at https://github.com/vmicheli/delta-iris.}
}
Endnote
%0 Conference Paper
%T Efficient World Models with Context-Aware Tokenization
%A Vincent Micheli
%A Eloi Alonso
%A François Fleuret
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-micheli24a
%I PMLR
%P 35623--35638
%U https://proceedings.mlr.press/v235/micheli24a.html
%V 235
%X Scaling up deep Reinforcement Learning (RL) methods presents a significant challenge. Following developments in generative modelling, model-based RL positions itself as a strong contender. Recent advances in sequence modelling have led to effective transformer-based world models, albeit at the price of heavy computations due to the long sequences of tokens required to accurately simulate environments. In this work, we propose $\Delta$-IRIS, a new agent with a world model architecture composed of a discrete autoencoder that encodes stochastic deltas between time steps and an autoregressive transformer that predicts future deltas by summarizing the current state of the world with continuous tokens. In the Crafter benchmark, $\Delta$-IRIS sets a new state of the art at multiple frame budgets, while being an order of magnitude faster to train than previous attention-based approaches. We release our code and models at https://github.com/vmicheli/delta-iris.
APA
Micheli, V., Alonso, E. & Fleuret, F. (2024). Efficient World Models with Context-Aware Tokenization. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:35623-35638. Available from https://proceedings.mlr.press/v235/micheli24a.html.
