Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Jesse Zhang; Brian Cheung; Chelsea Finn; Sergey Levine; Dinesh Jayaraman

Proceedings of the 37th International Conference on Machine Learning, PMLR 119:11055-11065, 2020.

Abstract

Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment. To overcome this difficulty, we propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments such as in a simulator, before it adapts to the target environment where failures carry heavy costs. We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk, which in turn enables relative safety through risk-averse, cautious adaptation. CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments. Then, when exploring a new safety-critical environment with unknown dynamics, the CARL agent plans to avoid actions that could lead to catastrophic states. In experiments on car driving, cartpole balancing, and half-cheetah locomotion, CARL successfully acquires cautious exploration behaviors, yielding higher rewards with fewer failures than strong RL adaptation baselines.
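The planning step described in the abstract (use a probabilistic model's uncertainty to steer away from actions that could lead to catastrophic states) can be illustrated with a small sketch. This is not the authors' implementation: the function names `risk_averse_score` and `pick_action`, the fixed catastrophe `penalty`, and the choice to average the worst fraction of sampled rollouts are all assumptions made here for illustration. The inputs stand in for returns and catastrophe predictions that a learned model would produce for each candidate action sequence.

```python
from typing import List, Sequence, Tuple


def risk_averse_score(
    returns: Sequence[float],
    catastrophe_flags: Sequence[bool],
    penalty: float = 100.0,        # hypothetical cost of a predicted catastrophe
    worst_fraction: float = 0.5,   # hypothetical share of rollouts to average
) -> float:
    """Score one candidate action sequence from sampled model rollouts."""
    if not returns or len(returns) != len(catastrophe_flags):
        raise ValueError("need one catastrophe flag per sampled return")
    # Subtract the penalty from every rollout predicted to end in catastrophe.
    adjusted = [r - penalty if bad else r
                for r, bad in zip(returns, catastrophe_flags)]
    # Be pessimistic: average only the worst-scoring rollouts.
    k = max(1, int(len(adjusted) * worst_fraction))
    worst = sorted(adjusted)[:k]
    return sum(worst) / k


def pick_action(
    candidates: List[Tuple[Sequence[float], Sequence[bool]]]
) -> int:
    """Return the index of the candidate with the best pessimistic score."""
    scores = [risk_averse_score(r, c) for r, c in candidates]
    return max(range(len(scores)), key=scores.__getitem__)


# Candidate 0 has the higher mean return but one predicted catastrophe;
# candidate 1 is slightly worse on average but never predicted to fail.
candidates = [
    ([10.0, 10.0, 10.0, -5.0], [False, False, False, True]),
    ([6.0, 6.0, 5.0, 5.0], [False, False, False, False]),
]
chosen = pick_action(candidates)
```

In this toy setup a planner that maximizes the mean return would prefer candidate 0, while the pessimistic score prefers candidate 1, which is the kind of cautious behavior the abstract describes.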

Cite this Paper

BibTeX


@InProceedings{pmlr-v119-zhang20e,
  title = 	 {Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings},
  author =       {Zhang, Jesse and Cheung, Brian and Finn, Chelsea and Levine, Sergey and Jayaraman, Dinesh},
  booktitle = 	 {Proceedings of the 37th International Conference on Machine Learning},
  pages = 	 {11055--11065},
  year = 	 {2020},
  editor = 	 {Daumé III, Hal and Singh, Aarti},
  volume = 	 {119},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--18 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v119/zhang20e/zhang20e.pdf},
  url = 	 {https://proceedings.mlr.press/v119/zhang20e.html},
  abstract = 	 {Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment. To overcome this difficulty, we propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments such as in a simulator, before it adapts to the target environment where failures carry heavy costs. We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk, which in turn enables relative safety through risk-averse, cautious adaptation. CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments. Then, when exploring a new safety-critical environment with unknown dynamics, the CARL agent plans to avoid actions that could lead to catastrophic states. In experiments on car driving, cartpole balancing, and half-cheetah locomotion, CARL successfully acquires cautious exploration behaviors, yielding higher rewards with fewer failures than strong RL adaptation baselines.}
}

Endnote

%0 Conference Paper
%T Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings
%A Jesse Zhang
%A Brian Cheung
%A Chelsea Finn
%A Sergey Levine
%A Dinesh Jayaraman
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-zhang20e
%I PMLR
%P 11055--11065
%U https://proceedings.mlr.press/v119/zhang20e.html
%V 119
%X Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment. To overcome this difficulty, we propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments such as in a simulator, before it adapts to the target environment where failures carry heavy costs. We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk, which in turn enables relative safety through risk-averse, cautious adaptation. CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments. Then, when exploring a new safety-critical environment with unknown dynamics, the CARL agent plans to avoid actions that could lead to catastrophic states. In experiments on car driving, cartpole balancing, and half-cheetah locomotion, CARL successfully acquires cautious exploration behaviors, yielding higher rewards with fewer failures than strong RL adaptation baselines.

APA


Zhang, J., Cheung, B., Finn, C., Levine, S., & Jayaraman, D. (2020). Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:11055-11065. Available from https://proceedings.mlr.press/v119/zhang20e.html.
