Kernel-Based Reinforcement Learning in Robust Markov Decision Processes

Shiau Hong Lim; Arnaud Autef

Proceedings of the 36th International Conference on Machine Learning, PMLR 97:3973-3981, 2019.

Abstract

The robust Markov decision process (MDP) framework aims to address parameter uncertainty due to model mismatch, approximation errors, or even adversarial behaviors. It is especially relevant when deploying learned policies in real-world applications. Scaling up the robust MDP framework to large or continuous state spaces remains a challenging problem. The use of function approximation in this case is usually inevitable, and this can only amplify the problem of model mismatch and parameter uncertainties. It has been previously shown that, in the case of MDPs with state aggregation, robust policies enjoy a tighter performance bound than standard solutions due to their reduced sensitivity to approximation errors. We extend these results to the much larger class of kernel-based approximators and show, both analytically and empirically, that robust policies can significantly outperform their non-robust counterparts.

Cite this Paper

BibTeX


@InProceedings{pmlr-v97-lim19a,
  title = 	 {Kernel-Based Reinforcement Learning in Robust {M}arkov Decision Processes},
  author =       {Lim, Shiau Hong and Autef, Arnaud},
  booktitle = 	 {Proceedings of the 36th International Conference on Machine Learning},
  pages = 	 {3973--3981},
  year = 	 {2019},
  editor = 	 {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = 	 {97},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--15 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v97/lim19a/lim19a.pdf},
  url = 	 {https://proceedings.mlr.press/v97/lim19a.html},
  abstract = 	 {The robust Markov decision processes (MDP) framework aims to address the problem of parameter uncertainty due to model mismatch, approximation errors or even adversarial behaviors. It is especially relevant when deploying the learned policies in real-world applications. Scaling up the robust MDP framework to large or continuous state space remains a challenging problem. The use of function approximation in this case is usually inevitable and this can only amplify the problem of model mismatch and parameter uncertainties. It has been previously shown that, in the case of MDPs with state aggregation, the robust policies enjoy a tighter performance bound compared to standard solutions due to its reduced sensitivity to approximation errors. We extend these results to the much larger class of kernel-based approximators and show, both analytically and empirically that the robust policies can significantly outperform the non-robust counterpart.}
}

Endnote

%0 Conference Paper
%T Kernel-Based Reinforcement Learning in Robust Markov Decision Processes
%A Shiau Hong Lim
%A Arnaud Autef
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-lim19a
%I PMLR
%P 3973--3981
%U https://proceedings.mlr.press/v97/lim19a.html
%V 97
%X The robust Markov decision processes (MDP) framework aims to address the problem of parameter uncertainty due to model mismatch, approximation errors or even adversarial behaviors. It is especially relevant when deploying the learned policies in real-world applications. Scaling up the robust MDP framework to large or continuous state space remains a challenging problem. The use of function approximation in this case is usually inevitable and this can only amplify the problem of model mismatch and parameter uncertainties. It has been previously shown that, in the case of MDPs with state aggregation, the robust policies enjoy a tighter performance bound compared to standard solutions due to its reduced sensitivity to approximation errors. We extend these results to the much larger class of kernel-based approximators and show, both analytically and empirically that the robust policies can significantly outperform the non-robust counterpart.

APA


Lim, S.H. & Autef, A. (2019). Kernel-Based Reinforcement Learning in Robust Markov Decision Processes. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:3973-3981. Available from https://proceedings.mlr.press/v97/lim19a.html.
