Expanding Motor Skills using Relay Networks

Visak CV Kumar, Sehoon Ha, C. Karen Liu
Proceedings of The 2nd Conference on Robot Learning, PMLR 87:744-756, 2018.

Abstract

While recent advances in deep reinforcement learning have achieved impressive results in learning motor skills, many learned policies are effective only within a limited set of initial states. We propose an algorithm that sequentially decomposes a complex robotic task into simpler subtasks and trains a local policy for each subtask, so that the robot can gradually expand its existing skill set. Our key idea is to build a directed graph of local control policies represented by neural networks, which we refer to as relay neural networks. Starting from the first policy, which attempts to achieve the task from a small set of initial states, the algorithm iteratively discovers the next subtask with increasingly difficult initial states until the last subtask matches the initial state distribution of the original task. The policy of each subtask aims to drive the robot to a state that the policy of its preceding subtask can handle. By taking advantage of many existing actor-critic style policy search algorithms, we use the optimized value function to define “good states” for the next policy to relay to.
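The relayed execution described in the abstract can be summarized in a short sketch. The following Python snippet is a minimal illustration under assumed interfaces, not the authors' implementation: policies (a list of trained local policies, ordered from the first, easiest subtask to the last), values (their optimized value functions), thresholds (cutoffs defining each policy's “good state” region), and the Gym-style env are all hypothetical stand-ins.

def select_policy(state, values, thresholds):
    """Pick the earliest subtask policy (the one closest to achieving the
    original task) whose value function deems the current state handleable."""
    for i, (value_fn, tau) in enumerate(zip(values, thresholds)):
        if value_fn(state) >= tau:
            return i
    # No earlier policy is competent from this state, so fall back to the
    # last-trained policy, whose initial states cover the full distribution
    # of the original task.
    return len(values) - 1

def relay_rollout(env, policies, values, thresholds, max_steps=1000):
    """Run one episode, re-selecting the active local policy at every step
    so that control is relayed to a preceding policy as soon as the state
    enters that policy's "good state" region."""
    state = env.reset()
    for _ in range(max_steps):
        i = select_policy(state, values, thresholds)
        action = policies[i](state)
        state, reward, done, info = env.step(action)
        if done:
            break
    return state

Training proceeds in the opposite direction of execution: each newly added policy is optimized to drive the robot into the region where the preceding policy's value function is high, which is what gives the thresholds above their meaning.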

Cite this Paper


BibTeX
@InProceedings{pmlr-v87-kumar18a,
  title = {Expanding Motor Skills using Relay Networks},
  author = {Kumar, Visak CV and Ha, Sehoon and Liu, C. Karen},
  booktitle = {Proceedings of The 2nd Conference on Robot Learning},
  pages = {744--756},
  year = {2018},
  editor = {Billard, Aude and Dragan, Anca and Peters, Jan and Morimoto, Jun},
  volume = {87},
  series = {Proceedings of Machine Learning Research},
  month = {29--31 Oct},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v87/kumar18a/kumar18a.pdf},
  url = {https://proceedings.mlr.press/v87/kumar18a.html},
  abstract = {While recent advances in deep reinforcement learning have achieved impressive results in learning motor skills, many learned policies are effective only within a limited set of initial states. We propose an algorithm that sequentially decomposes a complex robotic task into simpler subtasks and trains a local policy for each subtask, so that the robot can gradually expand its existing skill set. Our key idea is to build a directed graph of local control policies represented by neural networks, which we refer to as relay neural networks. Starting from the first policy, which attempts to achieve the task from a small set of initial states, the algorithm iteratively discovers the next subtask with increasingly difficult initial states until the last subtask matches the initial state distribution of the original task. The policy of each subtask aims to drive the robot to a state that the policy of its preceding subtask can handle. By taking advantage of many existing actor-critic style policy search algorithms, we use the optimized value function to define “good states” for the next policy to relay to.}
}
Endnote
%0 Conference Paper
%T Expanding Motor Skills using Relay Networks
%A Visak CV Kumar
%A Sehoon Ha
%A C. Karen Liu
%B Proceedings of The 2nd Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Aude Billard
%E Anca Dragan
%E Jan Peters
%E Jun Morimoto
%F pmlr-v87-kumar18a
%I PMLR
%P 744--756
%U https://proceedings.mlr.press/v87/kumar18a.html
%V 87
%X While recent advances in deep reinforcement learning have achieved impressive results in learning motor skills, many learned policies are effective only within a limited set of initial states. We propose an algorithm that sequentially decomposes a complex robotic task into simpler subtasks and trains a local policy for each subtask, so that the robot can gradually expand its existing skill set. Our key idea is to build a directed graph of local control policies represented by neural networks, which we refer to as relay neural networks. Starting from the first policy, which attempts to achieve the task from a small set of initial states, the algorithm iteratively discovers the next subtask with increasingly difficult initial states until the last subtask matches the initial state distribution of the original task. The policy of each subtask aims to drive the robot to a state that the policy of its preceding subtask can handle. By taking advantage of many existing actor-critic style policy search algorithms, we use the optimized value function to define “good states” for the next policy to relay to.
APA
Kumar, V.C., Ha, S. & Liu, C.K. (2018). Expanding Motor Skills using Relay Networks. Proceedings of The 2nd Conference on Robot Learning, in Proceedings of Machine Learning Research 87:744-756. Available from https://proceedings.mlr.press/v87/kumar18a.html.