Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four

Stephan Wäldchen, Sebastian Pokutta, Felix Huber
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:22457-22474, 2022.

Abstract

Characteristic functions (from cooperative game theory) are able to evaluate partial inputs and form the basis for attribution methods like Shapley values. These attribution methods allow us to measure how important each input component is for the function output—one of the goals of explainable AI (XAI). Given a standard classifier function, it is unclear how partial input should be realised. Instead, most XAI-methods for black-box classifiers like neural networks consider counterfactual inputs that generally lie off-manifold, which makes them hard to evaluate and easy to manipulate. We propose a setup to directly train characteristic functions in the form of neural networks to play simple two-player games. We apply this to the game of Connect Four by randomly hiding colour information from our agents during training. This has three advantages for comparing XAI-methods: It alleviates the ambiguity about how to realise partial input, makes off-manifold evaluation unnecessary and allows us to compare the methods by letting them play against each other.
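
As a rough illustration of how a characteristic function gives rise to Shapley values, the sketch below computes exact Shapley values for a small set of input components. It is not the authors' implementation: `shapley_values` and `toy_v` are hypothetical names, and the toy characteristic function merely stands in for a trained network that can evaluate partial inputs (for example, a Connect Four board with some colour information hidden).

```python
from itertools import combinations
from math import factorial


def shapley_values(n, v):
    """Exact Shapley values for components 0..n-1.

    `v` maps a frozenset of revealed components to a real number,
    i.e. it plays the role of a characteristic function.
    """
    values = []
    for i in range(n):
        others = [j for j in range(n) if j != i]
        phi = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain i.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of i when added to coalition s.
                phi += weight * (v(s | {i}) - v(s))
        values.append(phi)
    return values


def toy_v(coalition):
    # Hypothetical toy function: the output is 1 only when
    # components 0 and 1 are both revealed; component 2 is irrelevant.
    return 1.0 if {0, 1} <= coalition else 0.0


phi = shapley_values(3, toy_v)
print(phi)
```

In this toy case components 0 and 1 share the attribution equally and component 2 receives none. The enumeration is exponential in the number of components, so it is only meant for very small examples.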

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-waldchen22a,
  title     = {Training Characteristic Functions with Reinforcement Learning: {XAI}-methods play Connect Four},
  author    = {W{\"a}ldchen, Stephan and Pokutta, Sebastian and Huber, Felix},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {22457--22474},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/waldchen22a/waldchen22a.pdf},
  url       = {https://proceedings.mlr.press/v162/waldchen22a.html}
}
Endnote
%0 Conference Paper
%T Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four
%A Stephan Wäldchen
%A Sebastian Pokutta
%A Felix Huber
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-waldchen22a
%I PMLR
%P 22457--22474
%U https://proceedings.mlr.press/v162/waldchen22a.html
%V 162
APA
Wäldchen, S., Pokutta, S. & Huber, F. (2022). Training Characteristic Functions with Reinforcement Learning: XAI-methods play Connect Four. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:22457-22474. Available from https://proceedings.mlr.press/v162/waldchen22a.html.
