A Bayesian Approach to Robust Inverse Reinforcement Learning

Ran Wei, Siliang Zeng, Chenliang Li, Alfredo Garcia, Anthony D McDonald, Mingyi Hong
Proceedings of The 7th Conference on Robot Learning, PMLR 229:2304-2322, 2023.

Abstract

We consider a Bayesian approach to offline model-based inverse reinforcement learning (IRL). The proposed framework differs from existing offline model-based IRL approaches by estimating the expert’s reward function and subjective model of the environment dynamics simultaneously. We make use of a class of prior distributions that parameterizes how accurate the expert’s model of the environment is, and use it to develop efficient algorithms for estimating the expert’s reward and subjective dynamics in high-dimensional settings. Our analysis reveals a novel insight: the estimated policy exhibits robust performance when the expert is believed (a priori) to have a highly accurate model of the environment. We verify this observation in MuJoCo environments and show that our algorithms outperform state-of-the-art offline IRL algorithms.
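To make the estimation problem concrete, the following is a minimal sketch of the kind of joint objective the abstract describes. The notation here (reward parameters \theta, subjective dynamics parameters \psi, demonstrations \mathcal{D}, an empirical dynamics estimate \hat{P}, and an accuracy hyperparameter \lambda) is an illustrative assumption, not notation taken from the paper:

\max_{\theta,\psi} \; \log p(\mathcal{D} \mid \theta, \psi) + \log p(\psi),
\qquad
p(\psi) \propto \exp\Big(-\lambda \, \mathbb{E}_{(s,a)\sim\mathcal{D}}\big[ D_{\mathrm{KL}}\big( \hat{P}(\cdot \mid s,a) \,\|\, P_\psi(\cdot \mid s,a) \big) \big] \Big),

where the likelihood scores the demonstrations under the policy that is optimal for reward \theta and subjective dynamics \psi, and \lambda encodes the prior belief in the accuracy of the expert’s model of the environment. Per the abstract, large \lambda (an expert believed a priori to model the environment accurately) corresponds to the regime in which the estimated policy exhibits robust performance.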

Cite this Paper

BibTeX
@InProceedings{pmlr-v229-wei23a,
  title     = {A Bayesian Approach to Robust Inverse Reinforcement Learning},
  author    = {Wei, Ran and Zeng, Siliang and Li, Chenliang and Garcia, Alfredo and McDonald, Anthony D and Hong, Mingyi},
  booktitle = {Proceedings of The 7th Conference on Robot Learning},
  pages     = {2304--2322},
  year      = {2023},
  editor    = {Tan, Jie and Toussaint, Marc and Darvish, Kourosh},
  volume    = {229},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--09 Nov},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v229/wei23a/wei23a.pdf},
  url       = {https://proceedings.mlr.press/v229/wei23a.html}
}
Endnote
%0 Conference Paper
%T A Bayesian Approach to Robust Inverse Reinforcement Learning
%A Ran Wei
%A Siliang Zeng
%A Chenliang Li
%A Alfredo Garcia
%A Anthony D McDonald
%A Mingyi Hong
%B Proceedings of The 7th Conference on Robot Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Jie Tan
%E Marc Toussaint
%E Kourosh Darvish
%F pmlr-v229-wei23a
%I PMLR
%P 2304--2322
%U https://proceedings.mlr.press/v229/wei23a.html
%V 229
APA
Wei, R., Zeng, S., Li, C., Garcia, A., McDonald, A.D. & Hong, M. (2023). A Bayesian Approach to Robust Inverse Reinforcement Learning. Proceedings of The 7th Conference on Robot Learning, in Proceedings of Machine Learning Research 229:2304-2322. Available from https://proceedings.mlr.press/v229/wei23a.html.