Value Preserving State-Action Abstractions


David Abel, Nate Umbanhowar, Khimya Khetarpal, Dilip Arumugam, Doina Precup, Michael Littman
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1639-1650, 2020.


Abstraction can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information, potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we introduce combinations of state abstractions and options that are guaranteed to preserve the representation of near-optimal policies. We first define $\phi$-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for $\phi$-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, $\phi$-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.
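
To make the idea of pairing options with a state abstraction concrete, the following is a minimal sketch, not the authors' implementation. It assumes the usual reading of a $\phi$-relative option: the option may start only in ground states that $\phi$ maps to one abstract state, and it terminates as soon as the agent leaves that abstract state. The six-state chain, the `phi` mapping, and all class and function names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Hashable

State = Hashable
Action = Hashable


@dataclass
class PhiRelativeOption:
    """Illustrative option tied to a single abstract state (assumed definition)."""
    abstract_state: Hashable              # the cluster this option belongs to
    phi: Callable[[State], Hashable]      # state abstraction: ground -> abstract
    policy: Dict[State, Action]           # option policy over the cluster's states

    def can_initiate(self, s: State) -> bool:
        # Initiation: only in ground states inside the option's cluster.
        return self.phi(s) == self.abstract_state

    def terminates(self, s: State) -> bool:
        # Termination: as soon as the ground state leaves the cluster.
        return self.phi(s) != self.abstract_state

    def act(self, s: State) -> Action:
        return self.policy[s]


def phi(s: int) -> int:
    # Hypothetical abstraction: states 0-2 form cluster 0, states 3-5 cluster 1.
    return s // 3


def step(s: int, a: str) -> int:
    # Hypothetical deterministic chain dynamics on states 0..5.
    return min(s + 1, 5) if a == "right" else max(s - 1, 0)


def run_option(option: PhiRelativeOption, s: State,
               step_fn: Callable[[State, Action], State]) -> State:
    """Execute the option from s until it terminates; return the final state."""
    if not option.can_initiate(s):
        raise ValueError("option cannot be initiated in this state")
    while not option.terminates(s):
        s = step_fn(s, option.act(s))
    return s


# An option for cluster 0 that always moves right until it exits the cluster.
go_right = PhiRelativeOption(
    abstract_state=0,
    phi=phi,
    policy={0: "right", 1: "right", 2: "right"},
)
final_state = run_option(go_right, 0, step)
```

In this toy setup, an agent choosing among such options effectively plans over the two abstract states rather than the six ground states, which is the kind of state-action abstraction whose value loss the paper analyzes.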
