Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings

Jesse Zhang, Brian Cheung, Chelsea Finn, Sergey Levine, Dinesh Jayaraman
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:11055-11065, 2020.

Abstract

Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment. To overcome this difficulty, we propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments such as in a simulator, before it adapts to the target environment where failures carry heavy costs. We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk, which in turn enables relative safety through risk-averse, cautious adaptation. CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments. Then, when exploring a new safety-critical environment with unknown dynamics, the CARL agent plans to avoid actions that could lead to catastrophic states. In experiments on car driving, cartpole balancing, and half-cheetah locomotion, CARL successfully acquires cautious exploration behaviors, yielding higher rewards with fewer failures than strong RL adaptation baselines.
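To make the idea of risk-averse planning with a learned probabilistic model concrete, the sketch below illustrates one possible form it could take: an ensemble of dynamics models trained on diverse source environments scores candidate action sequences, and the planner penalizes the worst-case predicted probability of reaching a catastrophic state. This is a minimal illustration under assumed names (EnsembleMember, predict, catastrophe_prob, plan_cautiously) and a toy random-shooting planner, not the authors' implementation.

```python
# Minimal sketch of cautious, risk-averse planning with a model ensemble,
# in the spirit of CARL. All classes, functions, and dynamics here are
# illustrative assumptions, not the paper's actual code.
import numpy as np

class EnsembleMember:
    """Stand-in for one probabilistic dynamics model trained on a source env."""
    def __init__(self, seed):
        self.rng = np.random.default_rng(seed)

    def predict(self, state, action):
        # Toy stochastic dynamics: next_state = state + action + noise.
        return state + action + 0.05 * self.rng.standard_normal(state.shape)

    def catastrophe_prob(self, state):
        # Toy catastrophe predictor: risk grows as the state leaves [-1, 1].
        return float(np.clip(np.abs(state).max() - 1.0, 0.0, 1.0))

def plan_cautiously(ensemble, state, horizon=5, n_candidates=256,
                    risk_weight=10.0, rng=None):
    """Return the first action of the candidate sequence with the best
    risk-adjusted score: mean predicted return over the ensemble minus a
    penalty on the worst-case predicted catastrophe probability."""
    rng = rng or np.random.default_rng(0)
    best_score, best_action = -np.inf, None
    for _ in range(n_candidates):
        actions = rng.uniform(-1.0, 1.0, size=(horizon,) + state.shape)
        returns, worst_risk = [], 0.0
        for model in ensemble:
            s, total = state.copy(), 0.0
            for a in actions:
                s = model.predict(s, a)
                total += -np.abs(s).sum()          # toy reward: stay near the origin
                worst_risk = max(worst_risk, model.catastrophe_prob(s))
            returns.append(total)
        score = np.mean(returns) - risk_weight * worst_risk
        if score > best_score:
            best_score, best_action = score, actions[0]
    return best_action

if __name__ == "__main__":
    ensemble = [EnsembleMember(seed=i) for i in range(5)]
    state = np.array([0.3, -0.2])
    print("cautious first action:", plan_cautiously(ensemble, state))
```

The key design choice reflected here is that disagreement and predicted danger across models trained on varied source environments stand in for risk in the unknown target environment, so the planner prefers actions that no ensemble member predicts will lead to catastrophe.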

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-zhang20e,
  title     = {Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings},
  author    = {Zhang, Jesse and Cheung, Brian and Finn, Chelsea and Levine, Sergey and Jayaraman, Dinesh},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {11055--11065},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/zhang20e/zhang20e.pdf},
  url       = {https://proceedings.mlr.press/v119/zhang20e.html},
  abstract  = {Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment. To overcome this difficulty, we propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments such as in a simulator, before it adapts to the target environment where failures carry heavy costs. We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk, which in turn enables relative safety through risk-averse, cautious adaptation. CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments. Then, when exploring a new safety-critical environment with unknown dynamics, the CARL agent plans to avoid actions that could lead to catastrophic states. In experiments on car driving, cartpole balancing, and half-cheetah locomotion, CARL successfully acquires cautious exploration behaviors, yielding higher rewards with fewer failures than strong RL adaptation baselines.}
}
Endnote
%0 Conference Paper
%T Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings
%A Jesse Zhang
%A Brian Cheung
%A Chelsea Finn
%A Sergey Levine
%A Dinesh Jayaraman
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-zhang20e
%I PMLR
%P 11055--11065
%U https://proceedings.mlr.press/v119/zhang20e.html
%V 119
%X Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous, imperiling the RL agent, other agents, and the environment. To overcome this difficulty, we propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments such as in a simulator, before it adapts to the target environment where failures carry heavy costs. We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk, which in turn enables relative safety through risk-averse, cautious adaptation. CARL first employs model-based RL to train a probabilistic model to capture uncertainty about transition dynamics and catastrophic states across varied source environments. Then, when exploring a new safety-critical environment with unknown dynamics, the CARL agent plans to avoid actions that could lead to catastrophic states. In experiments on car driving, cartpole balancing, and half-cheetah locomotion, CARL successfully acquires cautious exploration behaviors, yielding higher rewards with fewer failures than strong RL adaptation baselines.
APA
Zhang, J., Cheung, B., Finn, C., Levine, S. & Jayaraman, D. (2020). Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:11055-11065. Available from https://proceedings.mlr.press/v119/zhang20e.html.
