Generative Temporal Models with Spatial Memory for Partially Observed Environments

Marco Fraccaro, Danilo Rezende, Yori Zwols, Alexander Pritzel, S. M. Ali Eslami, Fabio Viola
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1549-1558, 2018.

Abstract

In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent’s representations during training or via use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially partially observed and 3D environments. In this work we introduce a novel action-conditioned generative model of such challenging environments. The model features a non-parametric spatial memory system in which we store learned, disentangled representations of the environment. Low-dimensional spatial updates are computed using a state-space model that makes use of knowledge of the prior dynamics of the moving agent, and high-dimensional visual observations are modelled with a Variational Auto-Encoder. The result is a scalable architecture capable of performing coherent predictions over hundreds of time steps across a range of partially observed 2D and 3D environments.
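As a rough, non-authoritative illustration of the idea summarised above, the Python sketch below combines a low-dimensional state update driven by actions, an encoder/decoder pair standing in for the VAE, and a non-parametric memory of (state, frame-code) pairs queried by spatial proximity. It is not the authors' implementation: the class and method names (SpatialMemoryModel, observe, predict) are hypothetical, the encoder and decoder are passed in as placeholder callables, and the paper's learned stochastic state-space model is replaced here by a simple deterministic displacement update.

# Minimal sketch (not the paper's code) of a spatial-memory generative model.
# encode/decode are stand-ins for a VAE encoder/decoder; the transition is a
# placeholder for the learned state-space model described in the abstract.
import numpy as np

class SpatialMemoryModel:
    def __init__(self, encode, decode, k=5):
        self.encode = encode              # frame -> low-dimensional code
        self.decode = decode              # code  -> reconstructed frame
        self.k = k                        # number of memories to retrieve
        self.keys, self.values = [], []   # stored spatial states and frame codes
        self.state = np.zeros(2)          # 2D position; a richer state could add orientation

    def transition(self, state, action):
        # Deterministic displacement model; the paper instead uses a learned,
        # stochastic state-space model with a motion prior over the agent.
        return state + np.asarray(action, dtype=float)

    def observe(self, frame, action):
        # Update the spatial state from the action and store the encoded frame.
        self.state = self.transition(self.state, action)
        self.keys.append(self.state.copy())
        self.values.append(self.encode(frame))

    def predict(self, actions):
        # Roll the state forward under a sequence of actions, then decode a frame
        # from the k stored memories whose states are nearest to the predicted state.
        state = self.state.copy()
        for a in actions:
            state = self.transition(state, a)
        dists = np.linalg.norm(np.stack(self.keys) - state, axis=1)
        nearest = np.argsort(dists)[: self.k]
        code = np.mean([self.values[i] for i in nearest], axis=0)
        return self.decode(code)

With identity functions supplied for encode and decode, the sketch reduces to nearest-neighbour frame retrieval keyed by position; in the paper the codes are VAE latents and the retrieved memories condition the generative prior over future frames, which is what allows coherent prediction over hundreds of time steps.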

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-fraccaro18a,
  title     = {Generative Temporal Models with Spatial Memory for Partially Observed Environments},
  author    = {Fraccaro, Marco and Rezende, Danilo and Zwols, Yori and Pritzel, Alexander and Eslami, S. M. Ali and Viola, Fabio},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {1549--1558},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/fraccaro18a/fraccaro18a.pdf},
  url       = {https://proceedings.mlr.press/v80/fraccaro18a.html},
  abstract  = {In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent’s representations during training or via use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially partially-observed and 3D environments. In this work we introduce a novel action-conditioned generative model of such challenging environments. The model features a non-parametric spatial memory system in which we store learned, disentangled representations of the environment. Low-dimensional spatial updates are computed using a state-space model that makes use of knowledge on the prior dynamics of the moving agent, and high-dimensional visual observations are modelled with a Variational Auto-Encoder. The result is a scalable architecture capable of performing coherent predictions over hundreds of time steps across a range of partially observed 2D and 3D environments.}
}
Endnote
%0 Conference Paper
%T Generative Temporal Models with Spatial Memory for Partially Observed Environments
%A Marco Fraccaro
%A Danilo Rezende
%A Yori Zwols
%A Alexander Pritzel
%A S. M. Ali Eslami
%A Fabio Viola
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-fraccaro18a
%I PMLR
%P 1549--1558
%U https://proceedings.mlr.press/v80/fraccaro18a.html
%V 80
%X In model-based reinforcement learning, generative and temporal models of environments can be leveraged to boost agent performance, either by tuning the agent’s representations during training or via use as part of an explicit planning mechanism. However, their application in practice has been limited to simplistic environments, due to the difficulty of training such models in larger, potentially partially-observed and 3D environments. In this work we introduce a novel action-conditioned generative model of such challenging environments. The model features a non-parametric spatial memory system in which we store learned, disentangled representations of the environment. Low-dimensional spatial updates are computed using a state-space model that makes use of knowledge on the prior dynamics of the moving agent, and high-dimensional visual observations are modelled with a Variational Auto-Encoder. The result is a scalable architecture capable of performing coherent predictions over hundreds of time steps across a range of partially observed 2D and 3D environments.
APA
Fraccaro, M., Rezende, D., Zwols, Y., Pritzel, A., Eslami, S. M. A. & Viola, F. (2018). Generative Temporal Models with Spatial Memory for Partially Observed Environments. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1549-1558. Available from https://proceedings.mlr.press/v80/fraccaro18a.html.

Related Material