Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion

Seungeun Rho, Kartik Garg, Morgan Byrd, Sehoon Ha
Proceedings of The 9th Conference on Robot Learning, PMLR 305:2678-2694, 2025.

Abstract

Exploration is crucial for legged robots to learn agile locomotion behaviors capable of overcoming diverse obstacles. For example, a robot may need to try different contact patterns and momentum profiles to successfully jump over an obstacle—but encouraging such diverse exploration is inherently challenging. As a result, training these behaviors often relies on additional techniques such as extensive reward engineering, expert demonstrations, or curriculum learning. However, these approaches limit generalizability, especially when prior knowledge or demonstration data is unavailable. In this work, we propose using unsupervised skill discovery as a skill-level exploration strategy to significantly reduce human engineering effort. Our learning framework enables the agent to autonomously discover diverse skills to overcome complex obstacles. To dynamically regulate the degree of exploration throughout training, we introduce a bi-level optimization process that learns a parameter to balance two distinct reward signals. We demonstrate that our method enables quadrupedal robots to acquire highly agile behaviors—including crawling, climbing, leaping, and complex maneuvers such as jumping off vertical walls. Finally, we successfully deploy the learned policy on real hardware, validating its transferability to the real world.

Cite this Paper


BibTeX
@InProceedings{pmlr-v305-rho25a,
  title = {Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion},
  author = {Rho, Seungeun and Garg, Kartik and Byrd, Morgan and Ha, Sehoon},
  booktitle = {Proceedings of The 9th Conference on Robot Learning},
  pages = {2678--2694},
  year = {2025},
  editor = {Lim, Joseph and Song, Shuran and Park, Hae-Won},
  volume = {305},
  series = {Proceedings of Machine Learning Research},
  month = {27--30 Sep},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v305/main/assets/rho25a/rho25a.pdf},
  url = {https://proceedings.mlr.press/v305/rho25a.html},
  abstract = {Exploration is crucial for legged robots to learn agile locomotion behaviors capable of overcoming diverse obstacles. For example, a robot may need to try different contact patterns and momentum profiles to successfully jump over an obstacle—but encouraging such diverse exploration is inherently challenging. As a result, training these behaviors often relies on additional techniques such as extensive reward engineering, expert demonstrations, or curriculum learning. However, these approaches limit generalizability, especially when prior knowledge or demonstration data is unavailable. In this work, we propose using unsupervised skill discovery as a skill-level exploration strategy to significantly reduce human engineering effort. Our learning framework enables the agent to autonomously discover diverse skills to overcome complex obstacles. To dynamically regulate the degree of exploration throughout training, we introduce a bi-level optimization process that learns a parameter to balance two distinct reward signals. We demonstrate that our method enables quadrupedal robots to acquire highly agile behaviors—including crawling, climbing, leaping, and complex maneuvers such as jumping off vertical walls. Finally, we successfully deploy the learned policy on real hardware, validating its transferability to the real world.}
}
Endnote
%0 Conference Paper
%T Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion
%A Seungeun Rho
%A Kartik Garg
%A Morgan Byrd
%A Sehoon Ha
%B Proceedings of The 9th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Joseph Lim
%E Shuran Song
%E Hae-Won Park
%F pmlr-v305-rho25a
%I PMLR
%P 2678--2694
%U https://proceedings.mlr.press/v305/rho25a.html
%V 305
%X Exploration is crucial for legged robots to learn agile locomotion behaviors capable of overcoming diverse obstacles. For example, a robot may need to try different contact patterns and momentum profiles to successfully jump over an obstacle—but encouraging such diverse exploration is inherently challenging. As a result, training these behaviors often relies on additional techniques such as extensive reward engineering, expert demonstrations, or curriculum learning. However, these approaches limit generalizability, especially when prior knowledge or demonstration data is unavailable. In this work, we propose using unsupervised skill discovery as a skill-level exploration strategy to significantly reduce human engineering effort. Our learning framework enables the agent to autonomously discover diverse skills to overcome complex obstacles. To dynamically regulate the degree of exploration throughout training, we introduce a bi-level optimization process that learns a parameter to balance two distinct reward signals. We demonstrate that our method enables quadrupedal robots to acquire highly agile behaviors—including crawling, climbing, leaping, and complex maneuvers such as jumping off vertical walls. Finally, we successfully deploy the learned policy on real hardware, validating its transferability to the real world.
APA
Rho, S., Garg, K., Byrd, M., & Ha, S. (2025). Unsupervised Skill Discovery as Exploration for Learning Agile Locomotion. Proceedings of The 9th Conference on Robot Learning, in Proceedings of Machine Learning Research 305:2678-2694. Available from https://proceedings.mlr.press/v305/rho25a.html.
