Diffusion Augmented Agents: A Framework for Efficient Exploration and Transfer Learning
Proceedings of The 3rd Conference on Lifelong Learning Agents, PMLR 274:268-284, 2025.
Abstract
We address the problem of sample efficiency when training instruction-following embodied agents with reinforcement learning in a lifelong setting, where rewards may be sparse or absent. Our framework, which we call Diffusion Augmented Agent (DAAG), combines a large language model (LLM), a vision language model (VLM), and a diffusion-based pipeline for temporally and geometrically consistent conditional video generation to hindsight-relabel the agent’s past experience. Given a video-instruction pair and a target instruction, we ask the LLM whether our diffusion model could transform the video into one consistent with the target instruction, and, if so, we apply this transformation. We use such hindsight data augmentation to decrease 1) the amount of data needed to fine-tune a VLM that acts as a reward detector and 2) the amount of reward-labelled data needed for RL training. The LLM orchestrates this process, making the entire framework autonomous and independent of human supervision, and hence particularly suited to lifelong reinforcement learning scenarios. We empirically demonstrate gains in sample efficiency when training in simulated robotics environments, including manipulation and navigation tasks, showing improvements in learning reward detectors, transferring past experience, and learning new tasks, which are key abilities for efficient, lifelong learning agents.
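The hindsight relabeling loop described above can be sketched as follows. This is a minimal illustration under simplifying assumptions, not the paper's implementation: the functions `can_transform` and `transform_video` are hypothetical stand-ins for the LLM feasibility query and the diffusion pipeline, and the feasibility check here is a toy heuristic.

```python
def can_transform(source_instruction, target_instruction):
    # Stand-in for the LLM query "could the diffusion model transform this
    # video to match the target instruction?" Here, a toy heuristic: the
    # transformation is deemed feasible when both instructions share a verb.
    return source_instruction.split()[0] == target_instruction.split()[0]

def transform_video(video, target_instruction):
    # Stand-in for the diffusion-based video editing step: pair each frame
    # with the new goal instead of actually regenerating pixels.
    return [(frame, target_instruction) for frame in video]

def hindsight_augment(experience, target_instruction):
    """Relabel past (video, instruction) episodes toward a target instruction,
    keeping only those the (stand-in) LLM judges transformable."""
    augmented = []
    for video, instruction in experience:
        if can_transform(instruction, target_instruction):
            augmented.append(
                (transform_video(video, target_instruction), target_instruction)
            )
    return augmented

# Toy experience buffer: two past episodes with their original instructions.
experience = [
    (["frame0", "frame1"], "stack the red cube on the blue cube"),
    (["frame0", "frame1"], "push the ball to the corner"),
]
augmented = hindsight_augment(experience, "stack the green cube on the red cube")
print(len(augmented))  # only the compatible episode is relabelled
```

The augmented episodes can then feed both uses described in the abstract: extra positive examples for fine-tuning the VLM reward detector, and extra reward-labelled trajectories for RL training.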