Composable Planning with Attributes

Amy Zhang, Sainbayar Sukhbaatar, Adam Lerer, Arthur Szlam, Rob Fergus
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:5842-5851, 2018.

Abstract

The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between “nearby” sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.
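The high-level planning step described in the abstract, searching over paths through attribute space, can be illustrated with a minimal sketch. This is not the authors' implementation: it swaps in plain breadth-first search over an unweighted graph, whereas the abstract does not specify the search procedure or any edge weighting. The function name `plan_path`, the attribute names (`a_on_b`, `c_on_a`), and the hand-written graph are all hypothetical; in the paper's setup the graph is learned from experience and each edge is executed by a learned low-level policy.

```python
from collections import deque


def plan_path(graph, start, goal):
    """Return a shortest list of attribute sets from start to goal, or None.

    graph maps an attribute set (a frozenset) to the attribute sets that are
    assumed reachable from it in one low-level policy call (hypothetical data).
    """
    if start == goal:
        return [start]
    parents = {start: None}  # doubles as the visited set
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, ()):
            if nxt in parents:
                continue
            parents[nxt] = node
            if nxt == goal:
                # Walk parent links back to the start, then reverse.
                path = [nxt]
                while parents[path[-1]] is not None:
                    path.append(parents[path[-1]])
                return path[::-1]
            queue.append(nxt)
    return None  # goal is not reachable in the known graph


# Hypothetical block-stacking attributes; edges stand for "nearby" transitions.
EMPTY = frozenset()
A_ON_B = frozenset({"a_on_b"})
TOWER = frozenset({"a_on_b", "c_on_a"})

graph = {
    EMPTY: [A_ON_B],
    A_ON_B: [EMPTY, TOWER],
    TOWER: [A_ON_B],
}

plan = plan_path(graph, EMPTY, TOWER)
```

Here `plan` is the sequence of intermediate attribute sets; an agent following the abstract's description would hand each consecutive pair to its low-level policy in turn.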

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-zhang18k,
  title     = {Composable Planning with Attributes},
  author    = {Zhang, Amy and Sukhbaatar, Sainbayar and Lerer, Adam and Szlam, Arthur and Fergus, Rob},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {5842--5851},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/zhang18k/zhang18k.pdf},
  url       = {http://proceedings.mlr.press/v80/zhang18k.html},
  abstract  = {The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between “nearby” sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.}
}
Endnote
%0 Conference Paper
%T Composable Planning with Attributes
%A Amy Zhang
%A Sainbayar Sukhbaatar
%A Adam Lerer
%A Arthur Szlam
%A Rob Fergus
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-zhang18k
%I PMLR
%P 5842--5851
%U http://proceedings.mlr.press/v80/zhang18k.html
%V 80
%X The tasks that an agent will need to solve often are not known during training. However, if the agent knows which properties of the environment are important then, after learning how its actions affect those properties, it may be able to use this knowledge to solve complex tasks without training specifically for them. Towards this end, we consider a setup in which an environment is augmented with a set of user defined attributes that parameterize the features of interest. We propose a method that learns a policy for transitioning between “nearby” sets of attributes, and maintains a graph of possible transitions. Given a task at test time that can be expressed in terms of a target set of attributes, and a current state, our model infers the attributes of the current state and searches over paths through attribute space to get a high level plan, and then uses its low level policy to execute the plan. We show in 3D block stacking, grid-world games, and StarCraft that our model is able to generalize to longer, more complex tasks at test time by composing simpler learned policies.
APA
Zhang, A., Sukhbaatar, S., Lerer, A., Szlam, A., & Fergus, R. (2018). Composable Planning with Attributes. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:5842-5851. Available from http://proceedings.mlr.press/v80/zhang18k.html.
