Urat: Universal regularized adversarial training in robust reinforcement learning

Jingtang Chen, Haoxiang Chen, Zilin Niu, Yi Zhu
Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, PMLR 278:461-475, 2025.

Abstract

With the increasing maturity of reinforcement learning (RL) technology, its applications have expanded into several cutting-edge fields, such as artificial intelligence, robotics, intelligent manufacturing, self-driving cars, and cognitive computing. However, the complexity and uncertainty of the real world pose serious challenges to the stability of RL models. For example, in autonomous driving, unpredictable road conditions and variable weather can adversely affect the decision-making of intelligent driving algorithms, leading them to make irrational decisions. To address this problem, this study proposes a training method called Universal Regularized Adversarial Training in Robust Reinforcement Learning (Urat), which aims to enhance the robustness of deep reinforcement learning (DRL) policies against potential adversarial attacks. We introduce a powerful attacker for targeted adversarial training of DRL agents. In addition, we incorporate a robust policy regularizer into the algorithm, which helps agents learn policies that can effectively defend against a variety of attacks. The method was evaluated under adversarial attacks in several OpenAI Gym environments, including HalfCheetah-v4, Swimmer-v4, and Acrobot-v1. The results show that the Urat training method effectively improves the robustness of DRL policies and achieves robust performance in complex, uncertain environments. This work not only offers a new perspective on robust reinforcement learning but also provides theoretical support and technical safeguards for intelligent decision-making in practical applications such as autonomous driving.
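
The abstract describes the Urat recipe only at a high level: an attacker perturbs the agent's observations during training, and a robustness regularizer encourages the learned policy to hold up under such perturbations. The paper's exact attacker and regularizer are not given here, so the following is only a minimal PyTorch sketch of that general recipe; `pgd_observation_attack`, `robust_policy_loss`, and all hyperparameters (`epsilon`, `kappa`, step counts) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn.functional as F

def pgd_observation_attack(policy, obs, epsilon=0.05, steps=10, step_size=0.01):
    """Generic projected-gradient attack on observations (an assumed stand-in
    for the paper's attacker, whose exact form the abstract does not give).

    Searches for a bounded perturbation delta (||delta||_inf <= epsilon) that
    maximizes the KL divergence between the policy's action distributions on
    the clean and the perturbed observation."""
    with torch.no_grad():
        clean_logits = policy(obs)
    delta = torch.zeros_like(obs, requires_grad=True)
    for _ in range(steps):
        adv_logits = policy(obs + delta)
        kl = F.kl_div(F.log_softmax(adv_logits, dim=-1),
                      F.softmax(clean_logits, dim=-1),
                      reduction="batchmean")
        grad, = torch.autograd.grad(kl, delta)
        with torch.no_grad():
            delta += step_size * grad.sign()   # ascend on the divergence
            delta.clamp_(-epsilon, epsilon)    # project back into the ball
    return delta.detach()

def robust_policy_loss(policy, obs, rl_loss, kappa=1.0, epsilon=0.05):
    """RL training loss plus a robustness regularizer (assumed form): penalize
    how far the attacker can push the policy's output distribution."""
    delta = pgd_observation_attack(policy, obs, epsilon=epsilon)
    smoothness = F.kl_div(F.log_softmax(policy(obs + delta), dim=-1),
                          F.softmax(policy(obs), dim=-1),
                          reduction="batchmean")
    return rl_loss + kappa * smoothness
```

In a training loop, `robust_policy_loss` would replace the usual policy loss, with `kappa` trading off clean-environment return against robustness to the attacker.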

Cite this Paper


BibTeX
@InProceedings{pmlr-v278-chen25b,
  title     = {Urat: Universal regularized adversarial training in robust reinforcement learning},
  author    = {Chen, Jingtang and Chen, Haoxiang and Niu, Zilin and Zhu, Yi},
  booktitle = {Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing},
  pages     = {461--475},
  year      = {2025},
  editor    = {Zeng, Nianyin and Pachori, Ram Bilas and Wang, Dongshu},
  volume    = {278},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v278/main/assets/chen25b/chen25b.pdf},
  url       = {https://proceedings.mlr.press/v278/chen25b.html},
  abstract  = {With the increasing maturity of reinforcement learning (RL) technology, its applications have expanded into several cutting-edge fields, such as artificial intelligence, robotics, intelligent manufacturing, self-driving cars, and cognitive computing. However, the complexity and uncertainty of the real world pose serious challenges to the stability of RL models. For example, in autonomous driving, unpredictable road conditions and variable weather can adversely affect the decision-making of intelligent driving algorithms, leading them to make irrational decisions. To address this problem, this study proposes a training method called Universal Regularized Adversarial Training in Robust Reinforcement Learning (Urat), which aims to enhance the robustness of deep reinforcement learning (DRL) policies against potential adversarial attacks. We introduce a powerful attacker for targeted adversarial training of DRL agents. In addition, we incorporate a robust policy regularizer into the algorithm, which helps agents learn policies that can effectively defend against a variety of attacks. The method was evaluated under adversarial attacks in several OpenAI Gym environments, including HalfCheetah-v4, Swimmer-v4, and Acrobot-v1. The results show that the Urat training method effectively improves the robustness of DRL policies and achieves robust performance in complex, uncertain environments. This work not only offers a new perspective on robust reinforcement learning but also provides theoretical support and technical safeguards for intelligent decision-making in practical applications such as autonomous driving.}
}
Endnote
%0 Conference Paper
%T Urat: Universal regularized adversarial training in robust reinforcement learning
%A Jingtang Chen
%A Haoxiang Chen
%A Zilin Niu
%A Yi Zhu
%B Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing
%C Proceedings of Machine Learning Research
%D 2025
%E Nianyin Zeng
%E Ram Bilas Pachori
%E Dongshu Wang
%F pmlr-v278-chen25b
%I PMLR
%P 461--475
%U https://proceedings.mlr.press/v278/chen25b.html
%V 278
%X With the increasing maturity of reinforcement learning (RL) technology, its applications have expanded into several cutting-edge fields, such as artificial intelligence, robotics, intelligent manufacturing, self-driving cars, and cognitive computing. However, the complexity and uncertainty of the real world pose serious challenges to the stability of RL models. For example, in autonomous driving, unpredictable road conditions and variable weather can adversely affect the decision-making of intelligent driving algorithms, leading them to make irrational decisions. To address this problem, this study proposes a training method called Universal Regularized Adversarial Training in Robust Reinforcement Learning (Urat), which aims to enhance the robustness of deep reinforcement learning (DRL) policies against potential adversarial attacks. We introduce a powerful attacker for targeted adversarial training of DRL agents. In addition, we incorporate a robust policy regularizer into the algorithm, which helps agents learn policies that can effectively defend against a variety of attacks. The method was evaluated under adversarial attacks in several OpenAI Gym environments, including HalfCheetah-v4, Swimmer-v4, and Acrobot-v1. The results show that the Urat training method effectively improves the robustness of DRL policies and achieves robust performance in complex, uncertain environments. This work not only offers a new perspective on robust reinforcement learning but also provides theoretical support and technical safeguards for intelligent decision-making in practical applications such as autonomous driving.
APA
Chen, J., Chen, H., Niu, Z. & Zhu, Y. (2025). Urat: Universal regularized adversarial training in robust reinforcement learning. Proceedings of 2025 2nd International Conference on Machine Learning and Intelligent Computing, in Proceedings of Machine Learning Research 278:461-475. Available from https://proceedings.mlr.press/v278/chen25b.html.
