Model-Free Robust Reinforcement Learning with Sample Complexity Analysis

Yudan Wang; Shaofeng Zou; Yue Wang

Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, PMLR 244:3470-3513, 2024.

Abstract

Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored model-based approaches, with limited availability of model-free methods offering convergence guarantees or sample complexities. This paper proposes a model-free DR-RL algorithm leveraging the Multi-level Monte Carlo (MLMC) technique to close such a gap. Our innovative approach integrates a threshold mechanism that ensures finite sample requirements for algorithmic implementation, a significant departure from previous model-free algorithms. We adapt our algorithm to accommodate uncertainty sets defined by total variation, Chi-square divergence, and KL divergence, and provide finite sample analyses under all three cases. Remarkably, our algorithms represent the first model-free DR-RL approach featuring finite sample complexity for total variation and Chi-square divergence uncertainty sets, while also offering an improved sample complexity and broader applicability compared to existing model-free DR-RL algorithms for the KL divergence model. The complexities of our method establish the tightest results for all three uncertainty models in model-free DR-RL, underscoring the effectiveness and efficiency of our algorithm, and highlighting its potential for practical applications.

Cite this Paper

BibTeX

@InProceedings{pmlr-v244-wang24a,
  title = 	 {Model-Free Robust Reinforcement Learning with Sample Complexity Analysis},
  author =       {Wang, Yudan and Zou, Shaofeng and Wang, Yue},
  booktitle = 	 {Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence},
  pages = 	 {3470--3513},
  year = 	 {2024},
  editor = 	 {Kiyavash, Negar and Mooij, Joris M.},
  volume = 	 {244},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {15--19 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v244/main/assets/wang24a/wang24a.pdf},
  url = 	 {https://proceedings.mlr.press/v244/wang24a.html},
  abstract = 	 {Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored model-based approaches, with limited availability of model-free methods offering convergence guarantees or sample complexities. This paper proposes a model-free DR-RL algorithm leveraging the Multi-level Monte Carlo (MLMC) technique to close such a gap. Our innovative approach integrates a threshold mechanism that ensures finite sample requirements for algorithmic implementation, a significant departure from previous model-free algorithms. We adapt our algorithm to accommodate uncertainty sets defined by total variation, Chi-square divergence, and KL divergence, and provide finite sample analyses under all three cases. Remarkably, our algorithms represent the first model-free DR-RL approach featuring finite sample complexity for total variation and Chi-square divergence uncertainty sets, while also offering an improved sample complexity and broader applicability compared to existing model-free DR-RL algorithms for the KL divergence model. The complexities of our method establish the tightest results for all three uncertainty models in model-free DR-RL, underscoring the effectiveness and efficiency of our algorithm, and highlighting its potential for practical applications.}
}

Endnote

%0 Conference Paper
%T Model-Free Robust Reinforcement Learning with Sample Complexity Analysis
%A Yudan Wang
%A Shaofeng Zou
%A Yue Wang
%B Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
%C Proceedings of Machine Learning Research
%D 2024
%E Negar Kiyavash
%E Joris M. Mooij
%F pmlr-v244-wang24a
%I PMLR
%P 3470--3513
%U https://proceedings.mlr.press/v244/wang24a.html
%V 244
%X Distributionally Robust Reinforcement Learning (DR-RL) aims to derive a policy optimizing the worst-case performance within a predefined uncertainty set. Despite extensive research, previous DR-RL algorithms have predominantly favored model-based approaches, with limited availability of model-free methods offering convergence guarantees or sample complexities. This paper proposes a model-free DR-RL algorithm leveraging the Multi-level Monte Carlo (MLMC) technique to close such a gap. Our innovative approach integrates a threshold mechanism that ensures finite sample requirements for algorithmic implementation, a significant departure from previous model-free algorithms. We adapt our algorithm to accommodate uncertainty sets defined by total variation, Chi-square divergence, and KL divergence, and provide finite sample analyses under all three cases. Remarkably, our algorithms represent the first model-free DR-RL approach featuring finite sample complexity for total variation and Chi-square divergence uncertainty sets, while also offering an improved sample complexity and broader applicability compared to existing model-free DR-RL algorithms for the KL divergence model. The complexities of our method establish the tightest results for all three uncertainty models in model-free DR-RL, underscoring the effectiveness and efficiency of our algorithm, and highlighting its potential for practical applications.

APA

Wang, Y., Zou, S. & Wang, Y. (2024). Model-Free Robust Reinforcement Learning with Sample Complexity Analysis. Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence, in Proceedings of Machine Learning Research 244:3470-3513. Available from https://proceedings.mlr.press/v244/wang24a.html.
