Environment Curriculum Generation via Large Language Models

William Liang, Sam Wang, Hung-Ju Wang, Osbert Bastani, Dinesh Jayaraman, Yecheng Jason Ma
Proceedings of The 8th Conference on Robot Learning, PMLR 270:433-454, 2025.

Abstract

Recent work has demonstrated that a promising strategy for teaching robots a wide range of complex skills is to train them on a curriculum of progressively more challenging environments. However, developing an effective curriculum of environment distributions currently requires significant expertise, which must be repeated for every new domain. Our key insight is that environments are often naturally represented as code. Thus, we probe whether effective environment curriculum design can be achieved and automated via code generation by large language models (LLMs). In this paper, we introduce Eurekaverse, an unsupervised environment design algorithm that uses LLMs to sample progressively more challenging, diverse, and learnable environments for skill training. We validate Eurekaverse's effectiveness in the domain of quadrupedal parkour learning, in which a quadruped robot must traverse a variety of obstacle courses. The automatic curriculum designed by Eurekaverse enables gradual learning of complex parkour skills in simulation and transfers successfully to the real world, outperforming training on manually designed courses.
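The core idea, representing a training environment as code that an LLM can write and revise, can be illustrated with a minimal sketch. The example below is not taken from the paper: the function name, the height-field representation, and the obstacle parameters are assumptions, loosely modeled on how parkour terrains are commonly specified in legged-robot simulators. In Eurekaverse, functions of roughly this shape would be proposed and iteratively revised by the LLM across curriculum rounds.

    import numpy as np

    def set_course(height_field: np.ndarray, difficulty: float) -> np.ndarray:
        """Hypothetical example of an obstacle course expressed as code.

        Writes platforms and gaps into a terrain height field (meters);
        `difficulty` in [0, 1] scales obstacle size, so the same course
        code can be reused at different curriculum stages.
        """
        course = height_field.copy()
        length = course.shape[0]
        rng = np.random.default_rng(0)

        num_obstacles = 5
        gap_cells = int(2 + 6 * difficulty)      # wider gaps when harder
        max_step = 0.1 + 0.4 * difficulty        # taller platforms when harder
        x = length // 8
        for i in range(num_obstacles):
            platform_len = int(rng.integers(10, 20))
            step_height = max_step * (i + 1) / num_obstacles
            course[x:x + platform_len, :] = step_height   # raised platform
            x += platform_len
            course[x:x + gap_cells, :] = -1.0             # pit the robot must cross
            x += gap_cells
        return course

    if __name__ == "__main__":
        flat = np.zeros((200, 40))                 # 200 x 40 height-field cells
        easy = set_course(flat, difficulty=0.1)
        hard = set_course(flat, difficulty=0.9)
        print(easy.max(), hard.max())              # harder course has taller platforms

Because the environment is just a function over a height field, an outer loop can ask the LLM for new candidate course functions, train the policy on them, and keep those that remain challenging yet learnable; that selection loop is the part the paper automates.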

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-liang25a,
  title     = {Environment Curriculum Generation via Large Language Models},
  author    = {Liang, William and Wang, Sam and Wang, Hung-Ju and Bastani, Osbert and Jayaraman, Dinesh and Ma, Yecheng Jason},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {433--454},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/liang25a/liang25a.pdf},
  url       = {https://proceedings.mlr.press/v270/liang25a.html}
}
Endnote
%0 Conference Paper
%T Environment Curriculum Generation via Large Language Models
%A William Liang
%A Sam Wang
%A Hung-Ju Wang
%A Osbert Bastani
%A Dinesh Jayaraman
%A Yecheng Jason Ma
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-liang25a
%I PMLR
%P 433--454
%U https://proceedings.mlr.press/v270/liang25a.html
%V 270
APA
Liang, W., Wang, S., Wang, H., Bastani, O., Jayaraman, D. & Ma, Y. J. (2025). Environment Curriculum Generation via Large Language Models. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:433-454. Available from https://proceedings.mlr.press/v270/liang25a.html.
