A Bayesian Approach to Robust Reinforcement Learning

Esther Derman; Daniel Mankowitz; Timothy Mann; Shie Mannor

Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, PMLR 115:648-658, 2020.

Abstract

Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior. In this framework, transitions are modeled as arbitrary elements of a known and properly structured uncertainty set and a robust optimal policy can be derived under the worst-case scenario. In this study, we address the issue of learning in RMDPs using a Bayesian approach. We introduce the Uncertainty Robust Bellman Equation (URBE) which encourages safe exploration for adapting the uncertainty set to new observations while preserving robustness. We propose a URBE-based algorithm, DQN-URBE, that scales this method to higher dimensional domains. Our experiments show that the derived URBE-based strategy leads to a better trade-off between less conservative solutions and robustness in the presence of model misspecification. In addition, we show that the DQN-URBE algorithm can adapt significantly faster to changing dynamics online compared to existing robust techniques with fixed uncertainty sets.

Cite this Paper

BibTeX

@InProceedings{pmlr-v115-derman20a,
  title = 	 {A Bayesian Approach to Robust Reinforcement Learning},
  author =       {Derman, Esther and Mankowitz, Daniel and Mann, Timothy and Mannor, Shie},
  booktitle = 	 {Proceedings of The 35th Uncertainty in Artificial Intelligence Conference},
  pages = 	 {648--658},
  year = 	 {2020},
  editor = 	 {Adams, Ryan P. and Gogate, Vibhav},
  volume = 	 {115},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {22--25 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v115/derman20a/derman20a.pdf},
  url = 	 {https://proceedings.mlr.press/v115/derman20a.html},
  abstract = 	 {Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior. In this framework, transitions are modeled as arbitrary elements of a known and properly structured uncertainty set and a robust optimal policy can be derived under the worst-case scenario. In this study, we address the issue of learning in RMDPs using a Bayesian approach. We introduce the Uncertainty Robust Bellman Equation (URBE) which encourages safe exploration for adapting the uncertainty set to new observations while preserving robustness. We propose a URBE-based algorithm, DQN-URBE, that scales this method to higher dimensional domains. Our experiments show that the derived URBE-based strategy leads to a better trade-off between less conservative solutions and robustness in the presence of model misspecification. In addition, we show that the DQN-URBE algorithm can adapt significantly faster to changing dynamics online compared to existing robust techniques with fixed uncertainty sets.}
}

Endnote

%0 Conference Paper
%T A Bayesian Approach to Robust Reinforcement Learning
%A Esther Derman
%A Daniel Mankowitz
%A Timothy Mann
%A Shie Mannor
%B Proceedings of The 35th Uncertainty in Artificial Intelligence Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Ryan P. Adams
%E Vibhav Gogate
%F pmlr-v115-derman20a
%I PMLR
%P 648--658
%U https://proceedings.mlr.press/v115/derman20a.html
%V 115
%X Robust Markov Decision Processes (RMDPs) intend to ensure robustness with respect to changing or adversarial system behavior. In this framework, transitions are modeled as arbitrary elements of a known and properly structured uncertainty set and a robust optimal policy can be derived under the worst-case scenario. In this study, we address the issue of learning in RMDPs using a Bayesian approach. We introduce the Uncertainty Robust Bellman Equation (URBE) which encourages safe exploration for adapting the uncertainty set to new observations while preserving robustness. We propose a URBE-based algorithm, DQN-URBE, that scales this method to higher dimensional domains. Our experiments show that the derived URBE-based strategy leads to a better trade-off between less conservative solutions and robustness in the presence of model misspecification. In addition, we show that the DQN-URBE algorithm can adapt significantly faster to changing dynamics online compared to existing robust techniques with fixed uncertainty sets.

APA

Derman, E., Mankowitz, D., Mann, T. & Mannor, S. (2020). A Bayesian Approach to Robust Reinforcement Learning. Proceedings of The 35th Uncertainty in Artificial Intelligence Conference, in Proceedings of Machine Learning Research 115:648-658. Available from https://proceedings.mlr.press/v115/derman20a.html.
