Discovering Options for Exploration by Minimizing Cover Time

Yuu Jinnai, Jee Won Park, David Abel, George Konidaris
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3130-3139, 2019.

Abstract

One of the main challenges in reinforcement learning is solving tasks with sparse reward. We show that the difficulty of discovering a distant rewarding state in an MDP is bounded by the expected cover time of a random walk over the graph induced by the MDP’s transition dynamics. We therefore propose to accelerate exploration by constructing options that minimize cover time. We introduce a new option discovery algorithm that diminishes the expected cover time by connecting the most distant states in the state-space graph with options. We show empirically that the proposed algorithm improves learning in several domains with sparse rewards.
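To make the abstract's central quantities concrete, below is a minimal, self-contained Python sketch (not the authors' implementation): it estimates the expected cover time of a random walk on a small grid-world state graph by Monte Carlo simulation, then models an option as a shortcut edge between the two most distant states and re-estimates the cover time. The grid graph, the point-to-point shortcut, and all function names here are illustrative assumptions; the paper's actual algorithm operates on the graph induced by the MDP's transition dynamics.

    import random
    from collections import deque


    def grid_graph(n):
        """Adjacency lists for an n x n grid of states (4-connected)."""
        adj = {(r, c): [] for r in range(n) for c in range(n)}
        for r in range(n):
            for c in range(n):
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    nb = (r + dr, c + dc)
                    if nb in adj:
                        adj[(r, c)].append(nb)
        return adj


    def estimate_cover_time(adj, trials=200, seed=0):
        """Monte Carlo estimate of the expected number of random-walk steps
        needed to visit every state at least once, from a uniform start."""
        rng = random.Random(seed)
        states = list(adj)
        total = 0
        for _ in range(trials):
            current = rng.choice(states)
            visited = {current}
            steps = 0
            while len(visited) < len(states):
                current = rng.choice(adj[current])
                visited.add(current)
                steps += 1
            total += steps
        return total / trials


    def most_distant_pair(adj):
        """Pair of states with the largest shortest-path distance,
        found by running BFS from every state."""
        def bfs(src):
            dist = {src: 0}
            queue = deque([src])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if v not in dist:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            return dist

        best = (None, None, -1)
        for s in adj:
            dist = bfs(s)
            t, d = max(dist.items(), key=lambda kv: kv[1])
            if d > best[2]:
                best = (s, t, d)
        return best[0], best[1]


    if __name__ == "__main__":
        adj = grid_graph(6)
        print("cover time (no options):", estimate_cover_time(adj))
        u, v = most_distant_pair(adj)
        # An option is modelled here, for illustration only, as a
        # bidirectional shortcut edge between the two most distant states.
        adj[u].append(v)
        adj[v].append(u)
        print(f"cover time with option {u} <-> {v}:", estimate_cover_time(adj))

Running the sketch shows the estimated cover time dropping once the shortcut is added, which is the intuition the abstract appeals to: options that shorten the longest paths in the state-space graph reduce how long a random walk needs to reach distant, sparsely rewarding states.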

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-jinnai19b,
  title     = {Discovering Options for Exploration by Minimizing Cover Time},
  author    = {Jinnai, Yuu and Park, Jee Won and Abel, David and Konidaris, George},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {3130--3139},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/jinnai19b/jinnai19b.pdf},
  url       = {https://proceedings.mlr.press/v97/jinnai19b.html},
  abstract  = {One of the main challenges in reinforcement learning is solving tasks with sparse reward. We show that the difficulty of discovering a distant rewarding state in an MDP is bounded by the expected cover time of a random walk over the graph induced by the MDP’s transition dynamics. We therefore propose to accelerate exploration by constructing options that minimize cover time. We introduce a new option discovery algorithm that diminishes the expected cover time by connecting the most distant states in the state-space graph with options. We show empirically that the proposed algorithm improves learning in several domains with sparse rewards.}
}
Endnote
%0 Conference Paper
%T Discovering Options for Exploration by Minimizing Cover Time
%A Yuu Jinnai
%A Jee Won Park
%A David Abel
%A George Konidaris
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-jinnai19b
%I PMLR
%P 3130--3139
%U https://proceedings.mlr.press/v97/jinnai19b.html
%V 97
%X One of the main challenges in reinforcement learning is solving tasks with sparse reward. We show that the difficulty of discovering a distant rewarding state in an MDP is bounded by the expected cover time of a random walk over the graph induced by the MDP’s transition dynamics. We therefore propose to accelerate exploration by constructing options that minimize cover time. We introduce a new option discovery algorithm that diminishes the expected cover time by connecting the most distant states in the state-space graph with options. We show empirically that the proposed algorithm improves learning in several domains with sparse rewards.
APA
Jinnai, Y., Park, J.W., Abel, D. & Konidaris, G. (2019). Discovering Options for Exploration by Minimizing Cover Time. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:3130-3139. Available from https://proceedings.mlr.press/v97/jinnai19b.html.
