Collect & Infer - a fresh look at data-efficient Reinforcement Learning
Proceedings of the 5th Conference on Robot Learning, PMLR 164:1736-1744, 2022.
This position paper proposes a fresh look at Reinforcement Learning (RL) from the perspective of data-efficiency. RL has gone through three major stages: pure on-line RL where every data point is considered only once, RL with a replay buffer where additional learning is done on a portion of the experience, and finally transition-memory-based RL, where, conceptually, all transitions are stored and flexibly re-used in every update step. While inferring knowledge from all stored experience has led to a tremendous gain in data-efficiency, the question of how this data is collected has been vastly understudied. We argue that data-efficiency can only be achieved through careful consideration of both aspects. We propose to make this insight explicit via a paradigm that we call 'Collect and Infer', which explicitly models RL as two separate but interconnected processes, concerned with data collection and knowledge inference respectively.