Hierarchy Through Composition with Multitask LMDPs

Andrew M. Saxe, Adam C. Earle, Benjamin Rosman
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3017-3026, 2017.

Abstract

Hierarchical architectures are critical to the scalability of reinforcement learning methods. Most current hierarchical frameworks execute actions serially, with macro-actions comprising sequences of primitive actions. We propose a novel alternative to these control hierarchies based on concurrent execution of many actions in parallel. Our scheme exploits the guaranteed concurrent compositionality provided by the linearly solvable Markov decision process (LMDP) framework, which naturally enables a learning agent to draw on several macro-actions simultaneously to solve new tasks. We introduce the Multitask LMDP module, which maintains a parallel distributed representation of tasks and may be stacked to form deep hierarchies abstracted in space and time.
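The compositionality the abstract refers to is a linear-algebraic property of first-exit LMDPs: the desirability function z = exp(-v) satisfies a linear Bellman equation, so the map from a task's boundary (terminal) rewards to its optimal solution is linear, and the solution to a weighted blend of tasks is the same weighted blend of the component solutions. The sketch below illustrates this on a toy 1-D chain; the environment, cost values, and variable names are illustrative assumptions, not the paper's experiments.

# Minimal sketch of LMDP task compositionality (Todorov's linearly
# solvable MDP). The toy chain environment is an illustrative choice,
# not the authors' code.
import numpy as np

n_interior = 8                   # interior (non-terminal) states of a 1-D chain
q = 0.1 * np.ones(n_interior)    # uniform state cost at interior states

# Passive dynamics: unbiased random walk on the chain.
# Boundary (terminal) states sit at each end: [L, R].
P_ii = np.zeros((n_interior, n_interior))   # interior -> interior
P_ib = np.zeros((n_interior, 2))            # interior -> boundary [L, R]
for i in range(n_interior):
    for j, p in ((i - 1, 0.5), (i + 1, 0.5)):
        if j < 0:
            P_ib[i, 0] += p
        elif j >= n_interior:
            P_ib[i, 1] += p
        else:
            P_ii[i, j] += p

def solve_lmdp(z_boundary):
    """Solve the linear Bellman equation of a first-exit LMDP.

    At interior states the desirability z = exp(-v) satisfies
        z_I = diag(exp(-q)) (P_ii z_I + P_ib z_B),
    a linear system in z_I given the boundary desirability z_B.
    """
    Q = np.diag(np.exp(-q))
    A = np.eye(n_interior) - Q @ P_ii
    return np.linalg.solve(A, Q @ (P_ib @ z_boundary))

# Two base tasks: exit left vs. exit right (one-hot boundary desirability).
z1 = solve_lmdp(np.array([1.0, 0.0]))
z2 = solve_lmdp(np.array([0.0, 1.0]))

# Compositionality: a task whose boundary desirability is the blend
# w1*zB1 + w2*zB2 has optimal desirability w1*z1 + w2*z2, exactly.
w1, w2 = 0.3, 0.7
z_blend = solve_lmdp(np.array([w1, w2]))
assert np.allclose(z_blend, w1 * z1 + w2 * z2)
print("blended task solved exactly by composing base solutions")

The Multitask LMDP module described in the paper builds on exactly this property: it maintains a basis of task solutions and solves a new task by expressing its boundary reward structure as a weighted combination of the basis tasks, so many macro-actions contribute to the solution concurrently.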

Cite this Paper

BibTeX

@InProceedings{pmlr-v70-saxe17a,
  title     = {Hierarchy Through Composition with Multitask {LMDP}s},
  author    = {Andrew M. Saxe and Adam C. Earle and Benjamin Rosman},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {3017--3026},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/saxe17a/saxe17a.pdf},
  url       = {https://proceedings.mlr.press/v70/saxe17a.html}
}
Endnote

%0 Conference Paper
%T Hierarchy Through Composition with Multitask LMDPs
%A Andrew M. Saxe
%A Adam C. Earle
%A Benjamin Rosman
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-saxe17a
%I PMLR
%P 3017--3026
%U https://proceedings.mlr.press/v70/saxe17a.html
%V 70
APA

Saxe, A. M., Earle, A. C., & Rosman, B. (2017). Hierarchy Through Composition with Multitask LMDPs. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3017-3026. Available from https://proceedings.mlr.press/v70/saxe17a.html.
