Value Preserving State-Action Abstractions

David Abel, Nate Umbanhowar, Khimya Khetarpal, Dilip Arumugam, Doina Precup, Michael Littman
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:1639-1650, 2020.

Abstract

Abstraction can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information, potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve representation of near-optimal policies. We first define $\phi$-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for $\phi$-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, $\phi$-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.
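To make the formalism concrete, here is a minimal illustrative sketch of a φ-relative option, assuming the simplest reading of the abstract: the option's initiation set is the set of ground states mapped by φ to one abstract state, and it terminates upon leaving that cluster. All names (`phi`, `PhiRelativeOption`, the toy chain) are our own illustration, not from the paper's code or definitions.

```python
# Sketch only: a phi-relative option attached to one abstract state.
# Assumption: initiation set = ground states in the cluster; the option
# terminates (beta = 1) as soon as the agent leaves the cluster.

from dataclasses import dataclass
from typing import Callable, Dict, Hashable

State = Hashable
Action = Hashable

@dataclass
class PhiRelativeOption:
    phi: Dict[State, Hashable]          # state abstraction: ground state -> abstract state
    abstract_state: Hashable            # the cluster this option is defined relative to
    policy: Callable[[State], Action]   # ground-level policy executed inside the cluster

    def can_initiate(self, s: State) -> bool:
        # Initiable exactly on ground states mapped to this abstract state.
        return self.phi[s] == self.abstract_state

    def terminates(self, s: State) -> bool:
        # Terminate once the agent has left the cluster.
        return self.phi[s] != self.abstract_state

# Toy 4-state chain grouped into two abstract states by phi.
phi = {0: "A", 1: "A", 2: "B", 3: "B"}
go_right = PhiRelativeOption(phi, "A", policy=lambda s: "right")

print(go_right.can_initiate(1))  # True: state 1 lies in cluster A
print(go_right.terminates(2))    # True: state 2 is outside cluster A
```

The value-loss question the paper studies is then which sets of such options, per abstract state, suffice to represent a near-optimal policy of the underlying MDP.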

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-abel20a,
  title     = {Value Preserving State-Action Abstractions},
  author    = {Abel, David and Umbanhowar, Nate and Khetarpal, Khimya and Arumugam, Dilip and Precup, Doina and Littman, Michael},
  pages     = {1639--1650},
  year      = {2020},
  editor    = {Silvia Chiappa and Roberto Calandra},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  address   = {Online},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/abel20a/abel20a.pdf},
  url       = {http://proceedings.mlr.press/v108/abel20a.html},
  abstract  = {Abstraction can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information, potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve representation of near-optimal policies. We first define $\phi$-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for $\phi$-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, $\phi$-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.}
}
Endnote
%0 Conference Paper
%T Value Preserving State-Action Abstractions
%A David Abel
%A Nate Umbanhowar
%A Khimya Khetarpal
%A Dilip Arumugam
%A Doina Precup
%A Michael Littman
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-abel20a
%I PMLR
%J Proceedings of Machine Learning Research
%P 1639--1650
%U http://proceedings.mlr.press
%V 108
%W PMLR
%X Abstraction can improve the sample efficiency of reinforcement learning. However, the process of abstraction inherently discards information, potentially compromising an agent’s ability to represent high-value policies. To mitigate this, we here introduce combinations of state abstractions and options that are guaranteed to preserve representation of near-optimal policies. We first define $\phi$-relative options, a general formalism for analyzing the value loss of options paired with a state abstraction, and present necessary and sufficient conditions for $\phi$-relative options to preserve near-optimal behavior in any finite Markov Decision Process. We further show that, under appropriate assumptions, $\phi$-relative options can be composed to induce hierarchical abstractions that are also guaranteed to represent high-value policies.
APA
Abel, D., Umbanhowar, N., Khetarpal, K., Arumugam, D., Precup, D. & Littman, M. (2020). Value Preserving State-Action Abstractions. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in PMLR 108:1639-1650.