Uncertainty-driven Imagination for Continuous Deep Reinforcement Learning
Proceedings of the 1st Annual Conference on Robot Learning, PMLR 78:195-206, 2017.
Current state-of-the-art reinforcement learning methods such as the Deep Deterministic Policy Gradient (DDPG) algorithm can achieve continuous control of high-dimensional systems, but they require a large number of data samples. For real-world systems, this can be an obstacle, since excessive data collection can be expensive, tedious, or lead to physical damage. The main motivation of this work is to keep the advantages of model-free Q-learning while minimizing real-world interaction by employing a dynamics model learned in parallel. To counteract the adverse effects of imaginary rollouts with an inaccurate model, a notion of uncertainty is introduced so that artificial data is used only in cases of high uncertainty. We evaluate our approach on three simulated robot tasks and achieve at least 40 per cent faster learning than vanilla DDPG with multiple updates.
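The gating idea in the abstract can be illustrated with a minimal sketch. The abstract does not say how uncertainty is measured or how it controls the use of artificial data, so everything below is an assumption made for illustration: uncertainty is taken to be the variance among several Q-value estimates for the same state-action pair, and that variance (clipped to [0, 1]) is used as the probability of drawing the next training batch from imagined rather than real transitions. The names `imagination_probability`, `choose_batch_source`, and `scale` are hypothetical.

```python
import random
from statistics import pvariance


def imagination_probability(q_estimates, scale=1.0):
    """Map disagreement among Q-value estimates to a probability in [0, 1].

    Assumption: uncertainty is the population variance of the estimates,
    divided by a hypothetical normalising constant `scale` and clipped at 1.
    """
    variance = pvariance(q_estimates)
    return min(1.0, variance / scale)


def choose_batch_source(q_estimates, rng, scale=1.0):
    """Pick 'imagined' with probability equal to the uncertainty, else 'real'.

    High uncertainty makes model-generated (imagined) data more likely;
    zero uncertainty means only real transitions are used.
    """
    if rng.random() < imagination_probability(q_estimates, scale):
        return "imagined"
    return "real"
```

With identical estimates such as `[1.0, 1.0, 1.0]` the variance is zero, so this sketch always selects real data; with strongly disagreeing estimates such as `[0.0, 10.0]` the clipped probability is 1 and it always selects imagined data.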