Accelerating Visual Sparse-Reward Learning with Latent Nearest-Demonstration-Guided Explorations

Ruihan Zhao, Ufuk Topcu, Sandeep P. Chinchali, Mariano Phielipp
Proceedings of The 8th Conference on Robot Learning, PMLR 270:5294-5311, 2025.

Abstract

Recent progress in deep reinforcement learning (RL) and computer vision enables artificial agents to solve complex tasks, including locomotion, manipulation, and video games from high-dimensional pixel observations. However, RL usually relies on domain-specific reward functions for sufficient learning signals, requiring expert knowledge. While vision-based agents could learn skills from only sparse rewards, exploration challenges arise. We present Latent Nearest-demonstration-guided Exploration (LaNE), a novel and efficient method to solve sparse-reward robot manipulation tasks from image observations and a few demonstrations. First, LaNE builds on the pre-trained DINOv2 feature extractor to learn an embedding space for forward prediction. Next, it rewards the agent for exploring near the demos, quantified by quadratic control costs in the embedding space. Finally, LaNE optimizes the policy for the augmented rewards with RL. Experiments demonstrate that our method achieves state-of-the-art sample efficiency in Robosuite simulation and enables under-an-hour RL training from scratch on a Franka Panda robot, using only a few demonstrations.
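The abstract outlines three components: a latent space learned on top of frozen DINOv2 features via forward prediction, a reward bonus for staying near the nearest demonstration state in that space, and standard RL on the augmented reward. The sketch below illustrates one plausible reading of the first two steps. It is a minimal sketch, not the authors' released implementation: the backbone variant (dinov2_vits14, 384-dim features), the latent and action dimensions, the linear projection and dynamics heads, and the exponential shaping of the quadratic cost are all illustrative assumptions, with squared latent distance standing in for the paper's quadratic control cost.

import torch
import torch.nn as nn

# Frozen pre-trained DINOv2 feature extractor (ViT-S/14 assumed here; 384-dim CLS features).
backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vits14").eval()
for p in backbone.parameters():
    p.requires_grad_(False)

LATENT_DIM = 64   # hypothetical latent size
ACTION_DIM = 7    # hypothetical action size for a Franka Panda arm

proj = nn.Linear(384, LATENT_DIM)                           # learned projection head
dynamics = nn.Linear(LATENT_DIM + ACTION_DIM, LATENT_DIM)   # latent forward model

def embed(obs):
    # obs: (B, 3, 224, 224) ImageNet-normalized images -> (B, LATENT_DIM) latents.
    with torch.no_grad():
        feats = backbone(obs)  # (B, 384) CLS features from the frozen backbone
    return proj(feats)

def forward_prediction_loss(z, action, z_next):
    # Step 1: train proj/dynamics so the latent space supports forward prediction.
    pred = dynamics(torch.cat([z, action], dim=-1))
    return (pred - z_next.detach()).pow(2).mean()

def lane_bonus(z, demo_bank, scale=1.0):
    # Step 2: reward proximity to the nearest demonstration frame. Squared latent
    # distance is used here as a stand-in for the paper's quadratic control cost.
    # z: (B, LATENT_DIM) current latents; demo_bank: (N, LATENT_DIM) demo latents.
    d2 = torch.cdist(z, demo_bank).pow(2).min(dim=1).values
    return torch.exp(-scale * d2)  # bonus in (0, 1], largest at a demo state

In step 3, this bonus would be added to the sparse task reward and the augmented reward optimized with an off-the-shelf RL algorithm, as described in the abstract.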

Cite this Paper


BibTeX
@InProceedings{pmlr-v270-zhao25e,
  title     = {Accelerating Visual Sparse-Reward Learning with Latent Nearest-Demonstration-Guided Explorations},
  author    = {Zhao, Ruihan and Topcu, Ufuk and Chinchali, Sandeep P. and Phielipp, Mariano},
  booktitle = {Proceedings of The 8th Conference on Robot Learning},
  pages     = {5294--5311},
  year      = {2025},
  editor    = {Agrawal, Pulkit and Kroemer, Oliver and Burgard, Wolfram},
  volume    = {270},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v270/main/assets/zhao25e/zhao25e.pdf},
  url       = {https://proceedings.mlr.press/v270/zhao25e.html},
  abstract  = {Recent progress in deep reinforcement learning (RL) and computer vision enables artificial agents to solve complex tasks, including locomotion, manipulation, and video games from high-dimensional pixel observations. However, RL usually relies on domain-specific reward functions for sufficient learning signals, requiring expert knowledge. While vision-based agents could learn skills from only sparse rewards, exploration challenges arise. We present Latent Nearest-demonstration-guided Exploration (LaNE), a novel and efficient method to solve sparse-reward robot manipulation tasks from image observations and a few demonstrations. First, LaNE builds on the pre-trained DINOv2 feature extractor to learn an embedding space for forward prediction. Next, it rewards the agent for exploring near the demos, quantified by quadratic control costs in the embedding space. Finally, LaNE optimizes the policy for the augmented rewards with RL. Experiments demonstrate that our method achieves state-of-the-art sample efficiency in Robosuite simulation and enables under-an-hour RL training from scratch on a Franka Panda robot, using only a few demonstrations.}
}
Endnote
%0 Conference Paper
%T Accelerating Visual Sparse-Reward Learning with Latent Nearest-Demonstration-Guided Explorations
%A Ruihan Zhao
%A Ufuk Topcu
%A Sandeep P. Chinchali
%A Mariano Phielipp
%B Proceedings of The 8th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Pulkit Agrawal
%E Oliver Kroemer
%E Wolfram Burgard
%F pmlr-v270-zhao25e
%I PMLR
%P 5294--5311
%U https://proceedings.mlr.press/v270/zhao25e.html
%V 270
%X Recent progress in deep reinforcement learning (RL) and computer vision enables artificial agents to solve complex tasks, including locomotion, manipulation, and video games from high-dimensional pixel observations. However, RL usually relies on domain-specific reward functions for sufficient learning signals, requiring expert knowledge. While vision-based agents could learn skills from only sparse rewards, exploration challenges arise. We present Latent Nearest-demonstration-guided Exploration (LaNE), a novel and efficient method to solve sparse-reward robot manipulation tasks from image observations and a few demonstrations. First, LaNE builds on the pre-trained DINOv2 feature extractor to learn an embedding space for forward prediction. Next, it rewards the agent for exploring near the demos, quantified by quadratic control costs in the embedding space. Finally, LaNE optimizes the policy for the augmented rewards with RL. Experiments demonstrate that our method achieves state-of-the-art sample efficiency in Robosuite simulation and enables under-an-hour RL training from scratch on a Franka Panda robot, using only a few demonstrations.
APA
Zhao, R., Topcu, U., Chinchali, S. P., & Phielipp, M. (2025). Accelerating Visual Sparse-Reward Learning with Latent Nearest-Demonstration-Guided Explorations. Proceedings of The 8th Conference on Robot Learning, in Proceedings of Machine Learning Research 270:5294-5311. Available from https://proceedings.mlr.press/v270/zhao25e.html.