Robust Policy Learning over Multiple Uncertainty Sets

Annie Xie, Shagun Sodhani, Chelsea Finn, Joelle Pineau, Amy Zhang
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:24414-24429, 2022.

Abstract

Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments. While system identification methods provide a way to infer the variation from online experience, they can fail in settings where fast identification is not possible. Another dominant approach is robust RL which produces a policy that can handle worst-case scenarios, but these methods are generally designed to achieve robustness to a single uncertainty set that must be specified at train time. Towards a more general solution, we formulate the multi-set robustness problem to learn a policy robust to different perturbation sets. We then design an algorithm that enjoys the benefits of both system identification and robust RL: it reduces uncertainty where possible given a few interactions, but can still act robustly with respect to the remaining uncertainty. On a diverse set of control tasks, our approach demonstrates improved worst-case performance on new environments compared to prior methods based on system identification and on robust RL alone.
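The abstract's core idea, reducing uncertainty where a few interactions allow and then acting robustly with respect to whatever uncertainty remains, can be illustrated with a toy sketch. This is not the paper's algorithm: the 1-D dynamics, the candidate-filtering tolerance, and all names below are hypothetical, chosen only to show system identification and worst-case action selection working together.

```python
# Toy illustration (NOT the paper's method): coarse system identification
# prunes an uncertainty set of dynamics parameters, then a robust policy
# picks the action with the best worst-case outcome over what remains.

def step(param, state, action):
    # Hypothetical 1-D dynamics with one unknown parameter.
    return state + param * action

def reward(state):
    # Hypothetical objective: reach state 1.0.
    return -abs(state - 1.0)

def filter_candidates(candidates, transitions, tol):
    """System identification: keep only parameters consistent with
    the observed (state, action, next_state) transitions, up to tol."""
    return [p for p in candidates
            if all(abs(step(p, s, a) - s_next) <= tol
                   for s, a, s_next in transitions)]

def robust_action(candidates, state, actions):
    """Robust selection: maximize the worst-case one-step reward
    over the remaining uncertainty set."""
    return max(actions,
               key=lambda a: min(reward(step(p, state, a))
                                 for p in candidates))

true_param = 0.5
candidates = [0.5, 0.6, 2.0]        # uncertainty set at test time
actions = [0.5, 1.0, 1.8, 2.0]

# A few interactions reduce uncertainty where possible
# (here, 2.0 is ruled out, but 0.5 and 0.6 remain indistinguishable)...
transitions = [(0.0, 1.0, step(true_param, 0.0, 1.0))]
remaining = filter_candidates(candidates, transitions, tol=0.2)

# ...and the agent acts robustly with respect to the remaining set:
# 1.8 hedges between the two surviving parameter hypotheses.
chosen = robust_action(remaining, 0.0, actions)
```

The point of the sketch is the interplay: pure system identification would fail here (0.5 and 0.6 cannot be separated by the observed data), while pure robust RL over the full set would hedge against the already-eliminated parameter 2.0; combining the two yields a less conservative but still safe choice.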

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-xie22c,
  title     = {Robust Policy Learning over Multiple Uncertainty Sets},
  author    = {Xie, Annie and Sodhani, Shagun and Finn, Chelsea and Pineau, Joelle and Zhang, Amy},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {24414--24429},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/xie22c/xie22c.pdf},
  url       = {https://proceedings.mlr.press/v162/xie22c.html},
  abstract  = {Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments. While system identification methods provide a way to infer the variation from online experience, they can fail in settings where fast identification is not possible. Another dominant approach is robust RL which produces a policy that can handle worst-case scenarios, but these methods are generally designed to achieve robustness to a single uncertainty set that must be specified at train time. Towards a more general solution, we formulate the multi-set robustness problem to learn a policy robust to different perturbation sets. We then design an algorithm that enjoys the benefits of both system identification and robust RL: it reduces uncertainty where possible given a few interactions, but can still act robustly with respect to the remaining uncertainty. On a diverse set of control tasks, our approach demonstrates improved worst-case performance on new environments compared to prior methods based on system identification and on robust RL alone.}
}
Endnote
%0 Conference Paper
%T Robust Policy Learning over Multiple Uncertainty Sets
%A Annie Xie
%A Shagun Sodhani
%A Chelsea Finn
%A Joelle Pineau
%A Amy Zhang
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-xie22c
%I PMLR
%P 24414--24429
%U https://proceedings.mlr.press/v162/xie22c.html
%V 162
%X Reinforcement learning (RL) agents need to be robust to variations in safety-critical environments. While system identification methods provide a way to infer the variation from online experience, they can fail in settings where fast identification is not possible. Another dominant approach is robust RL which produces a policy that can handle worst-case scenarios, but these methods are generally designed to achieve robustness to a single uncertainty set that must be specified at train time. Towards a more general solution, we formulate the multi-set robustness problem to learn a policy robust to different perturbation sets. We then design an algorithm that enjoys the benefits of both system identification and robust RL: it reduces uncertainty where possible given a few interactions, but can still act robustly with respect to the remaining uncertainty. On a diverse set of control tasks, our approach demonstrates improved worst-case performance on new environments compared to prior methods based on system identification and on robust RL alone.
APA
Xie, A., Sodhani, S., Finn, C., Pineau, J., & Zhang, A. (2022). Robust Policy Learning over Multiple Uncertainty Sets. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:24414-24429. Available from https://proceedings.mlr.press/v162/xie22c.html.