Explaining Reinforcement Learning with Shapley Values

Daniel Beechey, Thomas M. S. Smith, Özgür Şimşek
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:2003-2014, 2023.

Abstract

For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
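The Shapley value referenced in the abstract assigns each player the weighted average of its marginal contributions across all coalitions. For intuition only, here is a minimal Python sketch (not the paper's SVERL method) that computes exact Shapley values for a made-up two-player cooperative game by enumerating coalitions; the function names and the game `v` are illustrative assumptions, not from the paper.

```python
from itertools import combinations
from math import factorial

def shapley_values(players, v):
    """Exact Shapley values by enumerating all coalitions.

    players: list of player identifiers.
    v: characteristic function mapping a frozenset of players
       to the value achieved by that coalition.
    """
    n = len(players)
    phi = {}
    for i in players:
        others = [p for p in players if p != i]
        total = 0.0
        for r in range(n):
            for subset in combinations(others, r):
                S = frozenset(subset)
                # Shapley weight |S|! (n - |S| - 1)! / n!
                weight = factorial(len(S)) * factorial(n - len(S) - 1) / factorial(n)
                # Marginal contribution of player i to coalition S
                total += weight * (v(S | {i}) - v(S))
        phi[i] = total
    return phi

# Hypothetical game: 'a' is worth 1 alone, 'b' is worth 2 alone,
# and cooperating adds a synergy bonus of 1.
v = lambda S: ({'a'} <= S) * 1 + ({'b'} <= S) * 2 + ({'a', 'b'} <= S) * 1
print(shapley_values(['a', 'b'], v))  # → {'a': 1.5, 'b': 2.5}
```

Note that the values sum to v({a, b}) = 4, illustrating the efficiency property that makes Shapley values a principled way to distribute an outcome among contributors.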

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-beechey23a,
  title     = {Explaining Reinforcement Learning with Shapley Values},
  author    = {Beechey, Daniel and Smith, Thomas M. S. and \c{S}im\c{s}ek, \"{O}zg\"{u}r},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {2003--2014},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/beechey23a/beechey23a.pdf},
  url       = {https://proceedings.mlr.press/v202/beechey23a.html},
  abstract  = {For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.}
}
Endnote
%0 Conference Paper
%T Explaining Reinforcement Learning with Shapley Values
%A Daniel Beechey
%A Thomas M. S. Smith
%A Özgür Şimşek
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-beechey23a
%I PMLR
%P 2003--2014
%U https://proceedings.mlr.press/v202/beechey23a.html
%V 202
%X For reinforcement learning systems to be widely adopted, their users must understand and trust them. We present a theoretical analysis of explaining reinforcement learning using Shapley values, following a principled approach from game theory for identifying the contribution of individual players to the outcome of a cooperative game. We call this general framework Shapley Values for Explaining Reinforcement Learning (SVERL). Our analysis exposes the limitations of earlier uses of Shapley values in reinforcement learning. We then develop an approach that uses Shapley values to explain agent performance. In a variety of domains, SVERL produces meaningful explanations that match and supplement human intuition.
APA
Beechey, D., Smith, T. M. S., & Şimşek, Ö. (2023). Explaining Reinforcement Learning with Shapley Values. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:2003-2014. Available from https://proceedings.mlr.press/v202/beechey23a.html.
