# Inverse Reinforcement Learning without Reinforcement Learning

*Proceedings of the 40th International Conference on Machine Learning*, PMLR 202:33299-33318, 2023.

#### Abstract

Inverse Reinforcement Learning (IRL) is a powerful set of techniques for imitation learning that aims to learn a reward function that rationalizes expert demonstrations. Unfortunately, traditional IRL methods suffer from a computational weakness: they require repeatedly solving a hard reinforcement learning (RL) problem as a subroutine. This is counter-intuitive from the viewpoint of reductions: we have reduced the *easier* problem of imitation learning to repeatedly solving the *harder* problem of RL. Another thread of work has proved that access to the side-information of the distribution of states where a strong policy spends time can dramatically reduce the sample and computational complexities of solving an RL problem. In this work, we demonstrate for the first time a more informed imitation learning reduction where we utilize the state distribution of the expert to alleviate the global exploration component of the RL subroutine, providing an *exponential* speedup in theory. In practice, we find that we are able to significantly speed up the prior art on continuous control tasks.