Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models

Weiye Zhao; Tairan He; Changliu Liu

Proceedings of The 5th Annual Learning for Dynamics and Control Conference, PMLR 211:783-796, 2023.

Abstract

Safety is one of the biggest concerns to applying reinforcement learning (RL) to the physical world. In its core part, it is challenging to ensure RL agents persistently satisfy a hard state constraint without white-box or black-box dynamics models. This paper presents an integrated model learning and safe control framework to safeguard any RL agent, where the environment dynamics are learned as Gaussian processes. The proposed theory provides (i) a novel method to construct an offline dataset for model learning that best achieves safety requirements; (ii) a design rule to construct the safety index to ensure the existence of safe control under control limits; (iii) a probabilistic safety guarantee (i.e. probabilistic forward invariance) when the model is learned using the aforementioned dataset. Simulation results show that our framework achieves almost zero safety violation on various continuous control tasks.

Cite this Paper

BibTeX


@InProceedings{pmlr-v211-zhao23a,
  title = 	 {Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models},
  author =       {Zhao, Weiye and He, Tairan and Liu, Changliu},
  booktitle = 	 {Proceedings of The 5th Annual Learning for Dynamics and Control Conference},
  pages = 	 {783--796},
  year = 	 {2023},
  editor = 	 {Matni, Nikolai and Morari, Manfred and Pappas, George J.},
  volume = 	 {211},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {15--16 Jun},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v211/zhao23a/zhao23a.pdf},
  url = 	 {https://proceedings.mlr.press/v211/zhao23a.html},
  abstract = 	 {Safety is one of the biggest concerns to applying reinforcement learning (RL) to the physical world. In its core part, it is challenging to ensure RL agents persistently satisfy a hard state constraint without white-box or black-box dynamics models. This paper presents an integrated model learning and safe control framework to safeguard any RL agent, where the environment dynamics are learned as Gaussian processes. The proposed theory provides (i) a novel method to construct an offline dataset for model learning that best achieves safety requirements; (ii) a design rule to construct the safety index to ensure the existence of safe control under control limits; (iii) a probabilistic safety guarantee (i.e. probabilistic forward invariance) when the model is learned using the aforementioned dataset. Simulation results show that our framework achieves almost zero safety violation on various continuous control tasks.}
}

Endnote

%0 Conference Paper
%T Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models
%A Weiye Zhao
%A Tairan He
%A Changliu Liu
%B Proceedings of The 5th Annual Learning for Dynamics and Control Conference
%C Proceedings of Machine Learning Research
%D 2023
%E Nikolai Matni
%E Manfred Morari
%E George J. Pappas
%F pmlr-v211-zhao23a
%I PMLR
%P 783--796
%U https://proceedings.mlr.press/v211/zhao23a.html
%V 211
%X Safety is one of the biggest concerns to applying reinforcement learning (RL) to the physical world. In its core part, it is challenging to ensure RL agents persistently satisfy a hard state constraint without white-box or black-box dynamics models. This paper presents an integrated model learning and safe control framework to safeguard any RL agent, where the environment dynamics are learned as Gaussian processes. The proposed theory provides (i) a novel method to construct an offline dataset for model learning that best achieves safety requirements; (ii) a design rule to construct the safety index to ensure the existence of safe control under control limits; (iii) a probabilistic safety guarantee (i.e. probabilistic forward invariance) when the model is learned using the aforementioned dataset. Simulation results show that our framework achieves almost zero safety violation on various continuous control tasks.

APA


Zhao, W., He, T., & Liu, C. (2023). Probabilistic Safeguard for Reinforcement Learning Using Safety Index Guided Gaussian Process Models. Proceedings of The 5th Annual Learning for Dynamics and Control Conference, in Proceedings of Machine Learning Research 211:783-796. Available from https://proceedings.mlr.press/v211/zhao23a.html.
