Do no harm: A counterfactual approach to safe reinforcement learning

Sean Vaskov, Wilko Schwarting, Chris Baker
Proceedings of the 6th Annual Learning for Dynamics & Control Conference, PMLR 242:1675-1687, 2024.

Abstract

Reinforcement Learning (RL) for control has become increasingly popular due to its ability to learn feedback policies that can take into account complex representations of the environment and uncertainty. When considering safety constraints, constrained optimization approaches where agents are penalized for constraint violations are commonly used. In such methods, if agents are initialized in or must visit states where constraint violation might be inevitable, it is unclear if or how much they should be penalized. We address this challenge by formulating a constraint on the counterfactual harm of the learned policy compared to an alternate, safe policy. In a philosophical sense this method only penalizes the learner for constraint violations that it caused; in a practical sense it maintains feasibility of the optimal control problem when constraint violation is inevitable. We present simulation studies on a rover with uncertain road friction and a tractor-trailer parking environment that demonstrate our constraint formulation enables agents to learn safer policies than traditional constrained RL methods.
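
The constraint described above can be illustrated with a minimal, hypothetical sketch in Python. Instead of penalizing every constraint violation (the standard constrained-RL treatment), the learner is only penalized for the violation cost it incurs beyond what the alternate safe policy would incur from the same start state. The environment interface (reset/step returning a per-step violation cost), the rollout_cost helper, and the Lagrangian-style use of the resulting harm term are assumptions made for illustration, not the paper's implementation.

def rollout_cost(env, policy, state, horizon=100):
    """Accumulated constraint-violation cost of one rollout from `state`.
    Assumes the environment exposes reset(state) returning an observation and
    step(action) returning (observation, per-step violation cost, done)."""
    obs = env.reset(state)
    total = 0.0
    for _ in range(horizon):
        action = policy(obs)
        obs, cost, done = env.step(action)
        total += cost
        if done:
            break
    return total

def counterfactual_harm(env, learner, safe_policy, state):
    """Counterfactual harm: violation cost the learner incurs beyond what the
    safe baseline policy would incur from the same start state. If violation is
    inevitable for both policies, the harm is zero, which keeps the constrained
    problem feasible."""
    excess = rollout_cost(env, learner, state) - rollout_cost(env, safe_policy, state)
    return max(excess, 0.0)

# In a Lagrangian-style constrained RL update one would then subtract
# lambda * counterfactual_harm(...) from the return, rather than penalizing
# every constraint violation outright as in a standard CMDP formulation.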

Cite this Paper


BibTeX
@InProceedings{pmlr-v242-vaskov24a,
  title = {Do no harm: {A} counterfactual approach to safe reinforcement learning},
  author = {Vaskov, Sean and Schwarting, Wilko and Baker, Chris},
  booktitle = {Proceedings of the 6th Annual Learning for Dynamics \& Control Conference},
  pages = {1675--1687},
  year = {2024},
  editor = {Abate, Alessandro and Cannon, Mark and Margellos, Kostas and Papachristodoulou, Antonis},
  volume = {242},
  series = {Proceedings of Machine Learning Research},
  month = {15--17 Jul},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v242/vaskov24a/vaskov24a.pdf},
  url = {https://proceedings.mlr.press/v242/vaskov24a.html},
  abstract = {Reinforcement Learning (RL) for control has become increasingly popular due to its ability to learn feedback policies that can take into account complex representations of the environment and uncertainty. When considering safety constraints, constrained optimization approaches where agents are penalized for constraint violations are commonly used. In such methods, if agents are initialized in or must visit states where constraint violation might be inevitable, it is unclear if or how much they should be penalized. We address this challenge by formulating a constraint on the counterfactual harm of the learned policy compared to an alternate, safe policy. In a philosophical sense this method only penalizes the learner for constraint violations that it caused; in a practical sense it maintains feasibility of the optimal control problem when constraint violation is inevitable. We present simulation studies on a rover with uncertain road friction and a tractor-trailer parking environment that demonstrate our constraint formulation enables agents to learn safer policies than traditional constrained RL methods.}
}
Endnote
%0 Conference Paper
%T Do no harm: A counterfactual approach to safe reinforcement learning
%A Sean Vaskov
%A Wilko Schwarting
%A Chris Baker
%B Proceedings of the 6th Annual Learning for Dynamics & Control Conference
%C Proceedings of Machine Learning Research
%D 2024
%E Alessandro Abate
%E Mark Cannon
%E Kostas Margellos
%E Antonis Papachristodoulou
%F pmlr-v242-vaskov24a
%I PMLR
%P 1675--1687
%U https://proceedings.mlr.press/v242/vaskov24a.html
%V 242
%X Reinforcement Learning (RL) for control has become increasingly popular due to its ability to learn feedback policies that can take into account complex representations of the environment and uncertainty. When considering safety constraints, constrained optimization approaches where agents are penalized for constraint violations are commonly used. In such methods, if agents are initialized in or must visit states where constraint violation might be inevitable, it is unclear if or how much they should be penalized. We address this challenge by formulating a constraint on the counterfactual harm of the learned policy compared to an alternate, safe policy. In a philosophical sense this method only penalizes the learner for constraint violations that it caused; in a practical sense it maintains feasibility of the optimal control problem when constraint violation is inevitable. We present simulation studies on a rover with uncertain road friction and a tractor-trailer parking environment that demonstrate our constraint formulation enables agents to learn safer policies than traditional constrained RL methods.
APA
Vaskov, S., Schwarting, W., & Baker, C. (2024). Do no harm: A counterfactual approach to safe reinforcement learning. Proceedings of the 6th Annual Learning for Dynamics & Control Conference, in Proceedings of Machine Learning Research 242:1675-1687. Available from https://proceedings.mlr.press/v242/vaskov24a.html.
