Accurate Shapley Values for explaining tree-based models

Salim I. Amoukou, Tangi Salaün, Nicolas Brunel
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:2448-2465, 2022.

Abstract

Although Shapley Values (SV) are widely used in explainable AI, they can be poorly understood and estimated, implying that their analysis may lead to spurious inferences and explanations. As a starting point, we recall an invariance principle for SV and derive the correct approach for computing the SV of categorical variables, which are particularly sensitive to the encoding used. In the case of tree-based models, we introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods. Simulations and comparisons are performed with state-of-the-art algorithms and show the practical gain of our approach. Finally, we discuss the ability of SV to provide reliable local explanations. We also provide a Python package that computes our estimators at https://github.com/salimamoukou/acv00.

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-amoukou22a,
  title     = {Accurate Shapley Values for explaining tree-based models},
  author    = {Amoukou, Salim I. and Sala{\"u}n, Tangi and Brunel, Nicolas},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {2448--2465},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/amoukou22a/amoukou22a.pdf},
  url       = {https://proceedings.mlr.press/v151/amoukou22a.html},
  abstract  = {Although Shapley Values (SV) are widely used in explainable AI, they can be poorly understood and estimated, implying that their analysis may lead to spurious inferences and explanations. As a starting point, we recall an invariance principle for SV and derive the correct approach for computing the SV of categorical variables, which are particularly sensitive to the encoding used. In the case of tree-based models, we introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods. Simulations and comparisons are performed with state-of-the-art algorithms and show the practical gain of our approach. Finally, we discuss the ability of SV to provide reliable local explanations. We also provide a Python package that computes our estimators at https://github.com/salimamoukou/acv00.}
}
Endnote
%0 Conference Paper
%T Accurate Shapley Values for explaining tree-based models
%A Salim I. Amoukou
%A Tangi Salaün
%A Nicolas Brunel
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-amoukou22a
%I PMLR
%P 2448--2465
%U https://proceedings.mlr.press/v151/amoukou22a.html
%V 151
%X Although Shapley Values (SV) are widely used in explainable AI, they can be poorly understood and estimated, implying that their analysis may lead to spurious inferences and explanations. As a starting point, we recall an invariance principle for SV and derive the correct approach for computing the SV of categorical variables, which are particularly sensitive to the encoding used. In the case of tree-based models, we introduce two estimators of Shapley Values that exploit the tree structure efficiently and are more accurate than state-of-the-art methods. Simulations and comparisons are performed with state-of-the-art algorithms and show the practical gain of our approach. Finally, we discuss the ability of SV to provide reliable local explanations. We also provide a Python package that computes our estimators at https://github.com/salimamoukou/acv00.
APA
Amoukou, S. I., Salaün, T., & Brunel, N. (2022). Accurate Shapley Values for explaining tree-based models. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:2448-2465. Available from https://proceedings.mlr.press/v151/amoukou22a.html.
