EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data

Jesse Zhang, Minho Heo, Zuxin Liu, Erdem Biyik, Joseph J Lim, Yao Liu, Rasool Fakoor
Proceedings of The 8th Conference on Robot Learning, PMLR 270:1148-1172, 2025.

Abstract

Most reinforcement learning (RL) methods focus on learning optimal policies over low-level action spaces. While these methods can perform well in their training environments, they lack the flexibility to transfer to new tasks. Instead, RL agents that can act over useful, temporally extended skills rather than low-level actions can learn new tasks more easily. Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL. Our approach, EXTRACT, instead utilizes pre-trained vision language models to extract a discrete set of semantically meaningful skills from offline data, each of which is parameterized by continuous arguments, without human supervision. This skill parameterization allows robots to learn new tasks by only needing to learn when to select a specific skill and how to modify its arguments for the specific task. We demonstrate through experiments in sparse-reward, image-based, robot manipulation environments that EXTRACT can more quickly learn new tasks than prior works, with major gains in sample efficiency and performance over prior skill-based RL.
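The abstract's central mechanism is a factored action interface: at each decision step the agent chooses a discrete skill and continuous arguments that adapt that skill to the task at hand. The sketch below shows one way such a factored policy head could be written in PyTorch; the class name, network sizes, and sampling details are illustrative assumptions and not the paper's implementation.

# Minimal sketch (assumptions, not the paper's code) of a policy that outputs
# a discrete skill ID plus continuous arguments for that skill.
import torch
import torch.nn as nn

class SkillArgumentPolicy(nn.Module):
    def __init__(self, state_dim: int, num_skills: int, arg_dim: int, hidden: int = 256):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
        )
        # Discrete head: which extracted skill to execute.
        self.skill_logits = nn.Linear(hidden, num_skills)
        # Continuous head: skill arguments, conditioned on the chosen skill.
        self.arg_mean = nn.Linear(hidden + num_skills, arg_dim)
        self.arg_log_std = nn.Linear(hidden + num_skills, arg_dim)

    def forward(self, state: torch.Tensor):
        h = self.encoder(state)
        skill_dist = torch.distributions.Categorical(logits=self.skill_logits(h))
        skill = skill_dist.sample()
        one_hot = nn.functional.one_hot(skill, self.skill_logits.out_features).float()
        h_args = torch.cat([h, one_hot], dim=-1)
        arg_dist = torch.distributions.Normal(
            self.arg_mean(h_args), self.arg_log_std(h_args).exp()
        )
        args = arg_dist.sample()
        return skill, args, skill_dist, arg_dist

# Example: a 32-dim state, 8 extracted skills, 4-dim skill arguments.
policy = SkillArgumentPolicy(state_dim=32, num_skills=8, arg_dim=4)
skill, args, _, _ = policy(torch.randn(1, 32))

In a full skill-based RL pipeline, the sampled (skill, arguments) pair would typically be handed to a learned low-level skill decoder that executes the corresponding temporally extended action sequence; that component is omitted here.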

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-zhang25c,
  title     = {EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data},
  author    = {Zhang, Jesse and Heo, Minho and Liu, Zuxin and Biyik, Erdem and Lim, Joseph J and Liu, Yao and Fakoor, Rasool},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {1148--1172},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/zhang25c/zhang25c.pdf},
  url       = {https://proceedings.mlr.press/v270/zhang25c.html},
  abstract  = {Most reinforcement learning (RL) methods focus on learning optimal policies over low-level action spaces. While these methods can perform well in their training environments, they lack the flexibility to transfer to new tasks. Instead, RL agents that can act over useful, temporally extended skills rather than low-level actions can learn new tasks more easily. Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL. Our approach, EXTRACT, instead utilizes pre-trained vision language models to extract a discrete set of semantically meaningful skills from offline data, each of which is parameterized by continuous arguments, without human supervision. This skill parameterization allows robots to learn new tasks by only needing to learn when to select a specific skill and how to modify its arguments for the specific task. We demonstrate through experiments in sparse-reward, image-based, robot manipulation environments that EXTRACT can more quickly learn new tasks than prior works, with major gains in sample efficiency and performance over prior skill-based RL.}
}
Endnote
%0 Conference Paper
%T EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data
%A Jesse Zhang
%A Minho Heo
%A Zuxin Liu
%A Erdem Biyik
%A Joseph J Lim
%A Yao Liu
%A Rasool Fakoor
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-zhang25c
%I PMLR
%P 1148--1172
%U https://proceedings.mlr.press/v270/zhang25c.html
%V 270
%X Most reinforcement learning (RL) methods focus on learning optimal policies over low-level action spaces. While these methods can perform well in their training environments, they lack the flexibility to transfer to new tasks. Instead, RL agents that can act over useful, temporally extended skills rather than low-level actions can learn new tasks more easily. Prior work in skill-based RL either requires expert supervision to define useful skills, which is hard to scale, or learns a skill-space from offline data with heuristics that limit the adaptability of the skills, making them difficult to transfer during downstream RL. Our approach, EXTRACT, instead utilizes pre-trained vision language models to extract a discrete set of semantically meaningful skills from offline data, each of which is parameterized by continuous arguments, without human supervision. This skill parameterization allows robots to learn new tasks by only needing to learn when to select a specific skill and how to modify its arguments for the specific task. We demonstrate through experiments in sparse-reward, image-based, robot manipulation environments that EXTRACT can more quickly learn new tasks than prior works, with major gains in sample efficiency and performance over prior skill-based RL.
APA
Zhang, J., Heo, M., Liu, Z., Biyik, E., Lim, J.J., Liu, Y. & Fakoor, R. (2025). EXTRACT: Efficient Policy Learning by Extracting Transferable Robot Skills from Offline Data. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:1148-1172. Available from https://proceedings.mlr.press/v270/zhang25c.html.
