Learning Factored Markov Decision Processes with Unawareness

Craig Innes; Alex Lascarides

Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, PMLR 115:123-133, 2020.

Abstract

Methods for learning and planning in sequential decision problems often assume the learner is aware of all possible states and actions in advance. This assumption is sometimes untenable. In this paper, we give a method to learn factored Markov decision problems from both domain exploration and expert assistance, which guarantees convergence to near-optimal behaviour, even when the agent begins unaware of factors critical to success. Our experiments show our agent learns optimal behaviour on both small and large problems, and that conserving information on discovering new possibilities results in faster convergence.

Cite this Paper

BibTeX


@InProceedings{pmlr-v115-innes20a,
  title = 	 {Learning Factored Markov Decision Processes with Unawareness},
  author =       {Innes, Craig and Lascarides, Alex},
  booktitle = 	 {Proceedings of The 35th Uncertainty in Artificial Intelligence Conference},
  pages = 	 {123--133},
  year = 	 {2020},
  editor = 	 {Adams, Ryan P. and Gogate, Vibhav},
  volume = 	 {115},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {22--25 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v115/innes20a/innes20a.pdf},
  url = 	 {https://proceedings.mlr.press/v115/innes20a.html},
  abstract = 	 {Methods for learning and planning in sequential decision problems often assume the learner is aware of all possible states and actions in advance. This assumption is sometimes untenable. In this paper, we give a method to learn factored {M}arkov decision problems from both domain exploration and expert assistance, which guarantees convergence to near-optimal behaviour, even when the agent begins unaware of factors critical to success. Our experiments show our agent learns optimal behaviour on both small and large problems, and that conserving information on discovering new possibilities results in faster convergence.}
}

Endnote

%0 Conference Paper
%T Learning Factored Markov Decision Processes with Unawareness
%A Craig Innes
%A Alex Lascarides
%B Proceedings of The 35th Uncertainty in Artificial Intelligence Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Ryan P. Adams
%E Vibhav Gogate
%F pmlr-v115-innes20a
%I PMLR
%P 123--133
%U https://proceedings.mlr.press/v115/innes20a.html
%V 115
%X Methods for learning and planning in sequential decision problems often assume the learner is aware of all possible states and actions in advance. This assumption is sometimes untenable. In this paper, we give a method to learn factored Markov decision problems from both domain exploration and expert assistance, which guarantees convergence to near-optimal behaviour, even when the agent begins unaware of factors critical to success. Our experiments show our agent learns optimal behaviour on both small and large problems, and that conserving information on discovering new possibilities results in faster convergence.

APA


Innes, C. & Lascarides, A. (2020). Learning Factored Markov Decision Processes with Unawareness. Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, in Proceedings of Machine Learning Research 115:123-133. Available from https://proceedings.mlr.press/v115/innes20a.html.
