Predictive and Causal Implications of using Shapley Value for Model Interpretation

Sisi Ma, Roshan Tourani
Proceedings of the 2020 KDD Workshop on Causal Discovery, PMLR 127:23-38, 2020.

Abstract

The Shapley value is a concept from game theory. Recently, it has been used to explain complex models produced by machine learning techniques. Although the mathematical definition of the Shapley value is straightforward, the implications of using it as a model interpretation tool have yet to be described. In the current paper, we analyzed the Shapley value in the Bayesian network framework. We established the relationship between the Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with a high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with a low Shapley value from a model could impair performance. Therefore, using the Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.
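
As a rough illustration of the first claim above (not taken from the paper), the sketch below computes exact Shapley values for a hypothetical two-variable game. The names `shapley_values` and `toy_value` and the variables `X1` and `X2` are assumptions made for this example: `X2` is imagined to be an exact copy of `X1`, and the value of a variable subset is an idealized predictive score that equals 1 whenever the subset is non-empty.

```python
from itertools import combinations
from math import factorial


def shapley_values(players, value):
    """Exact Shapley values for a set function `value` defined on subsets of `players`."""
    n = len(players)
    phi = {}
    for p in players:
        others = [q for q in players if q != p]
        total = 0.0
        for k in range(n):
            # Weight of a coalition of size k that does not contain p.
            weight = factorial(k) * factorial(n - k - 1) / factorial(n)
            for subset in combinations(others, k):
                s = frozenset(subset)
                # Marginal contribution of p when added to coalition s.
                total += weight * (value(s | {p}) - value(s))
        phi[p] = total
    return phi


def toy_value(subset):
    """Hypothetical score: any non-empty subset predicts the target perfectly,
    because X2 is assumed to be an exact copy of X1."""
    return 1.0 if subset else 0.0


phi = shapley_values(["X1", "X2"], toy_value)
print(phi)
```

Under these assumptions both variables receive the same nonzero Shapley value, yet `toy_value` is unchanged when either one is dropped, since the remaining variable carries the same information. This is only a toy case of redundancy; the paper's analysis is stated in terms of conditional independence in Bayesian networks.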

Cite this Paper


BibTeX
@InProceedings{pmlr-v127-ma20a,
  title = {Predictive and Causal Implications of using Shapley Value for Model Interpretation},
  author = {Ma, Sisi and Tourani, Roshan},
  booktitle = {Proceedings of the 2020 KDD Workshop on Causal Discovery},
  pages = {23--38},
  year = {2020},
  editor = {},
  volume = {127},
  series = {Proceedings of Machine Learning Research},
  month = {24 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v127/ma20a/ma20a.pdf},
  url = {https://proceedings.mlr.press/v127/ma20a.html},
  abstract = {The Shapley value is a concept from game theory. Recently, it has been used to explain complex models produced by machine learning techniques. Although the mathematical definition of the Shapley value is straightforward, the implications of using it as a model interpretation tool have yet to be described. In the current paper, we analyzed the Shapley value in the Bayesian network framework. We established the relationship between the Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with a high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with a low Shapley value from a model could impair performance. Therefore, using the Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.}
}
Endnote
%0 Conference Paper
%T Predictive and Causal Implications of using Shapley Value for Model Interpretation
%A Sisi Ma
%A Roshan Tourani
%B Proceedings of the 2020 KDD Workshop on Causal Discovery
%C Proceedings of Machine Learning Research
%D 2020
%E
%F pmlr-v127-ma20a
%I PMLR
%P 23--38
%U https://proceedings.mlr.press/v127/ma20a.html
%V 127
%X The Shapley value is a concept from game theory. Recently, it has been used to explain complex models produced by machine learning techniques. Although the mathematical definition of the Shapley value is straightforward, the implications of using it as a model interpretation tool have yet to be described. In the current paper, we analyzed the Shapley value in the Bayesian network framework. We established the relationship between the Shapley value and conditional independence, a key concept in both predictive and causal modeling. Our results indicate that eliminating a variable with a high Shapley value from a model does not necessarily impair predictive performance, whereas eliminating a variable with a low Shapley value from a model could impair performance. Therefore, using the Shapley value for feature selection does not result in the most parsimonious and predictively optimal model in the general case. More importantly, the Shapley value of a variable does not reflect its causal relationship with the target of interest.
APA
Ma, S. & Tourani, R. (2020). Predictive and Causal Implications of using Shapley Value for Model Interpretation. Proceedings of the 2020 KDD Workshop on Causal Discovery, in Proceedings of Machine Learning Research 127:23-38. Available from https://proceedings.mlr.press/v127/ma20a.html.
