Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum

Jigang Kim, Daesol Cho, H. Jin Kim
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:16441-16457, 2023.

Abstract

While reinforcement learning (RL) has achieved great success in acquiring complex skills solely from environmental interactions, it assumes that resets to the initial state are readily available at the end of each episode. Such an assumption hinders the autonomous learning of embodied agents because resetting in the physical world requires time-consuming and cumbersome workarounds. Hence, there has been growing interest in autonomous RL (ARL) methods that are capable of learning from non-episodic interactions. However, existing ARL methods are limited by their reliance on prior data and are unable to learn in environments where task-relevant interactions are sparse. In contrast, we propose a demonstration-free ARL algorithm via Implicit and Bidirectional Curriculum (IBC). With an auxiliary agent that is conditionally activated upon learning progress and a bidirectional goal curriculum based on optimal transport, our method outperforms previous methods, including those that leverage demonstrations.
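To make the two mechanisms named in the abstract concrete, below is a minimal, hypothetical sketch, not the authors' implementation: the auxiliary agent is activated when progress on the main task stalls, and curriculum goals are chosen by matching candidate goals to already-reached states with a discrete optimal-transport (assignment-problem) solver. All names, the activation threshold, and the Euclidean cost are illustrative assumptions.

    # Illustrative sketch only; the threshold, Euclidean cost, and all names
    # are assumptions, not details of the IBC algorithm itself.
    import numpy as np
    from scipy.optimize import linear_sum_assignment

    def select_agent(task_success_rate, threshold=0.3):
        # Conditionally activate the auxiliary agent when learning progress
        # on the main task is low (hypothetical activation criterion).
        return "auxiliary" if task_success_rate < threshold else "main"

    def ot_curriculum_goals(candidate_goals, reached_states, k):
        # Match candidate goals to states the agent already reaches, using the
        # assignment problem as a discrete optimal-transport proxy, then return
        # the k goals with the cheapest matches (the "easiest" frontier goals).
        cost = np.linalg.norm(
            candidate_goals[:, None, :] - reached_states[None, :, :], axis=-1)
        rows, cols = linear_sum_assignment(cost)
        order = np.argsort(cost[rows, cols])
        return candidate_goals[rows[order[:k]]]

    # Toy usage in a 2-D goal space.
    rng = np.random.default_rng(0)
    goals = rng.uniform(size=(16, 2))    # candidate goals for the curriculum
    reached = rng.uniform(size=(16, 2))  # states visited during reset-free interaction
    print(select_agent(task_success_rate=0.1))    # -> "auxiliary"
    print(ot_curriculum_goals(goals, reached, k=4))

In the paper, "bidirectional" refers to curricula toward both the task goals and the initial states, so the agent learns to reach goals and to return without manual resets; the sketch above only illustrates the optimal-transport goal-matching step.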

Cite this Paper

BibTeX
@InProceedings{pmlr-v202-kim23d,
  title     = {Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum},
  author    = {Kim, Jigang and Cho, Daesol and Kim, H. Jin},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {16441--16457},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/kim23d/kim23d.pdf},
  url       = {https://proceedings.mlr.press/v202/kim23d.html},
  abstract  = {While reinforcement learning (RL) has achieved great success in acquiring complex skills solely from environmental interactions, it assumes that resets to the initial state are readily available at the end of each episode. Such an assumption hinders the autonomous learning of embodied agents because resetting in the physical world requires time-consuming and cumbersome workarounds. Hence, there has been growing interest in autonomous RL (ARL) methods that are capable of learning from non-episodic interactions. However, existing ARL methods are limited by their reliance on prior data and are unable to learn in environments where task-relevant interactions are sparse. In contrast, we propose a demonstration-free ARL algorithm via Implicit and Bidirectional Curriculum (IBC). With an auxiliary agent that is conditionally activated upon learning progress and a bidirectional goal curriculum based on optimal transport, our method outperforms previous methods, including those that leverage demonstrations.}
}
Endnote
%0 Conference Paper
%T Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum
%A Jigang Kim
%A Daesol Cho
%A H. Jin Kim
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-kim23d
%I PMLR
%P 16441--16457
%U https://proceedings.mlr.press/v202/kim23d.html
%V 202
%X While reinforcement learning (RL) has achieved great success in acquiring complex skills solely from environmental interactions, it assumes that resets to the initial state are readily available at the end of each episode. Such an assumption hinders the autonomous learning of embodied agents because resetting in the physical world requires time-consuming and cumbersome workarounds. Hence, there has been growing interest in autonomous RL (ARL) methods that are capable of learning from non-episodic interactions. However, existing ARL methods are limited by their reliance on prior data and are unable to learn in environments where task-relevant interactions are sparse. In contrast, we propose a demonstration-free ARL algorithm via Implicit and Bidirectional Curriculum (IBC). With an auxiliary agent that is conditionally activated upon learning progress and a bidirectional goal curriculum based on optimal transport, our method outperforms previous methods, including those that leverage demonstrations.
APA
Kim, J., Cho, D., & Kim, H. J. (2023). Demonstration-free Autonomous Reinforcement Learning via Implicit and Bidirectional Curriculum. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:16441-16457. Available from https://proceedings.mlr.press/v202/kim23d.html.
