A Laplacian Framework for Option Discovery in Reinforcement Learning

Marlos C. Machado, Marc G. Bellemare, Michael Bowling
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2295-2304, 2017.

Abstract

Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for representation learning in MDPs. In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned representations. The options discovered from eigenpurposes traverse the principal directions of the state space. They are useful for multiple tasks because they are discovered without taking the environment’s rewards into consideration. Moreover, different options act at different time scales, making them helpful for exploration. We demonstrate features of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.
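The core idea above can be made concrete with a small numerical sketch. Below is a minimal, illustrative numpy example (not the authors' code; the names `eigenpurpose_reward` and the corridor domain are assumptions for illustration): build the state-transition graph of a six-state corridor, take eigenvectors of its graph Laplacian (the proto-value functions), and turn one eigenvector into an intrinsic reward of the form r(s, s') = eᵀ(φ(s') − φ(s)), which with tabular one-hot features reduces to e[s'] − e[s].

```python
import numpy as np

# Illustrative sketch: eigenpurposes in a 6-state corridor MDP.
num_states = 6

# Adjacency matrix of the state-transition graph (a path: s <-> s+1).
A = np.zeros((num_states, num_states))
for s in range(num_states - 1):
    A[s, s + 1] = A[s + 1, s] = 1.0

# Combinatorial Laplacian L = D - A; its eigenvectors are the PVFs.
L = np.diag(A.sum(axis=1)) - A
eigvals, eigvecs = np.linalg.eigh(L)  # eigenvalues in ascending order

def eigenpurpose_reward(e, s, s_next):
    """Intrinsic reward induced by eigenvector e with one-hot features:
    r(s, s') = e^T (phi(s') - phi(s)) = e[s'] - e[s]."""
    return e[s_next] - e[s]

# Smoothest non-constant PVF (the Fiedler vector). Its eigenpurpose rewards
# movement along the corridor's principal direction; negating e gives the
# option that traverses the same direction the opposite way.
e1 = eigvecs[:, 1]
rewards = [eigenpurpose_reward(e1, s, s + 1) for s in range(num_states - 1)]
print(rewards)
```

Because the reward is a difference of eigenvector components, every step toward one end of the corridor gets a reward of consistent sign, so the greedy policy over this intrinsic reward is an option that travels to that end, operating at a time scale set by the eigenvector's smoothness.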

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-machado17a,
  title     = {A {L}aplacian Framework for Option Discovery in Reinforcement Learning},
  author    = {Marlos C. Machado and Marc G. Bellemare and Michael Bowling},
  pages     = {2295--2304},
  year      = {2017},
  editor    = {Doina Precup and Yee Whye Teh},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  address   = {International Convention Centre, Sydney, Australia},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/machado17a/machado17a.pdf},
  url       = {http://proceedings.mlr.press/v70/machado17a.html},
  abstract  = {Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for representation learning in MDPs. In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned representations. The options discovered from eigenpurposes traverse the principal directions of the state space. They are useful for multiple tasks because they are discovered without taking the environment’s rewards into consideration. Moreover, different options act at different time scales, making them helpful for exploration. We demonstrate features of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.}
}
Endnote
%0 Conference Paper
%T A Laplacian Framework for Option Discovery in Reinforcement Learning
%A Marlos C. Machado
%A Marc G. Bellemare
%A Michael Bowling
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-machado17a
%I PMLR
%J Proceedings of Machine Learning Research
%P 2295--2304
%U http://proceedings.mlr.press
%V 70
%W PMLR
%X Representation learning and option discovery are two of the biggest challenges in reinforcement learning (RL). Proto-value functions (PVFs) are a well-known approach for representation learning in MDPs. In this paper we address the option discovery problem by showing how PVFs implicitly define options. We do it by introducing eigenpurposes, intrinsic reward functions derived from the learned representations. The options discovered from eigenpurposes traverse the principal directions of the state space. They are useful for multiple tasks because they are discovered without taking the environment’s rewards into consideration. Moreover, different options act at different time scales, making them helpful for exploration. We demonstrate features of eigenpurposes in traditional tabular domains as well as in Atari 2600 games.
APA
Machado, M.C., Bellemare, M.G. & Bowling, M. (2017). A Laplacian Framework for Option Discovery in Reinforcement Learning. Proceedings of the 34th International Conference on Machine Learning, in PMLR 70:2295-2304.