The Many Shapley Values for Model Explanation

Mukund Sundararajan, Amir Najmi
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:9269-9278, 2020.

Abstract

The Shapley value has become the basis for several methods that attribute the prediction of a machine-learning model on an input to its base features. The use of the Shapley value is justified by citing the uniqueness result from \cite{Shapley53}, which shows that it is the only method that satisfies certain good properties (\emph{axioms}). There is, however, a multiplicity of ways in which the Shapley value is operationalized for model explanation. These differ in how they reference the model, the training data, and the explanation context. Hence they differ in output, rendering the uniqueness result inapplicable. Furthermore, the techniques that rely on the training data produce non-intuitive attributions; for instance, unused features can still receive attribution. In this paper, we use the axiomatic approach to study the differences between some of the many operationalizations of the Shapley value for attribution. We discuss a technique called Baseline Shapley (BShap), provide a proper uniqueness result for it, and contrast it with two other techniques from prior literature, Integrated Gradients \cite{STY17} and Conditional Expectation Shapley \cite{Lundberg2017AUA}.
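To make the contrast concrete, the following is a minimal sketch in Python (not the authors' code) of the two set functions at issue: BShap, which replaces absent features with a fixed baseline, and Conditional Expectation Shapley, which conditions on the present features over training data. The model f, the input, the baseline, and the two-row dataset are all hypothetical choices for illustration; because the two features are perfectly correlated in the data, CES assigns attribution to a feature the model never reads, while BShap gives it zero.

from itertools import combinations
from math import factorial

def shapley(v, n):
    # Exact Shapley values for set function v over players {0, ..., n-1}.
    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for r in range(len(others) + 1):
            for S in combinations(others, r):
                # Classic Shapley weight |S|! (n - |S| - 1)! / n!
                w = factorial(r) * factorial(n - r - 1) / factorial(n)
                phi[i] += w * (v(set(S) | {i}) - v(set(S)))
    return phi

f = lambda z: z[0]        # toy model: uses only feature 0, ignores feature 1
x = [1.0, 1.0]            # input being explained
baseline = [0.0, 0.0]     # baseline for BShap

def v_bshap(S):
    # Features in S take their input values; the rest take baseline values.
    return f([x[i] if i in S else baseline[i] for i in range(len(x))])

# Toy training data in which the two features are perfectly correlated.
data = [[0.0, 0.0], [1.0, 1.0]]

def v_ces(S):
    # Empirical conditional expectation of f given that features in S match x.
    rows = [z for z in data if all(z[i] == x[i] for i in S)]
    return sum(f(z) for z in rows) / len(rows)

print(shapley(v_bshap, 2))  # [1.0, 0.0]  -- the unused feature gets nothing
print(shapley(v_ces, 2))    # [0.25, 0.25] -- the unused feature gets attribution

Note that the two methods are efficient with respect to different reference points: the BShap attributions sum to f(x) - f(baseline) = 1, while the CES attributions sum to f(x) minus the mean prediction over the dataset, 1 - 0.5 = 0.5.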

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-sundararajan20b,
  title = {The Many Shapley Values for Model Explanation},
  author = {Sundararajan, Mukund and Najmi, Amir},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages = {9269--9278},
  year = {2020},
  editor = {III, Hal Daumé and Singh, Aarti},
  volume = {119},
  series = {Proceedings of Machine Learning Research},
  month = {13--18 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v119/sundararajan20b/sundararajan20b.pdf},
  url = {https://proceedings.mlr.press/v119/sundararajan20b.html},
  abstract = {The Shapley value has become the basis for several methods that attribute the prediction of a machine-learning model on an input to its base features. The use of the Shapley value is justified by citing the uniqueness result from \cite{Shapley53}, which shows that it is the only method that satisfies certain good properties (\emph{axioms}). There is, however, a multiplicity of ways in which the Shapley value is operationalized for model explanation. These differ in how they reference the model, the training data, and the explanation context. Hence they differ in output, rendering the uniqueness result inapplicable. Furthermore, the techniques that rely on the training data produce non-intuitive attributions; for instance, unused features can still receive attribution. In this paper, we use the axiomatic approach to study the differences between some of the many operationalizations of the Shapley value for attribution. We discuss a technique called Baseline Shapley (BShap), provide a proper uniqueness result for it, and contrast it with two other techniques from prior literature, Integrated Gradients \cite{STY17} and Conditional Expectation Shapley \cite{Lundberg2017AUA}.}
}
Endnote
%0 Conference Paper
%T The Many Shapley Values for Model Explanation
%A Mukund Sundararajan
%A Amir Najmi
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-sundararajan20b
%I PMLR
%P 9269--9278
%U https://proceedings.mlr.press/v119/sundararajan20b.html
%V 119
%X The Shapley value has become the basis for several methods that attribute the prediction of a machine-learning model on an input to its base features. The use of the Shapley value is justified by citing the uniqueness result from \cite{Shapley53}, which shows that it is the only method that satisfies certain good properties (\emph{axioms}). There is, however, a multiplicity of ways in which the Shapley value is operationalized for model explanation. These differ in how they reference the model, the training data, and the explanation context. Hence they differ in output, rendering the uniqueness result inapplicable. Furthermore, the techniques that rely on the training data produce non-intuitive attributions; for instance, unused features can still receive attribution. In this paper, we use the axiomatic approach to study the differences between some of the many operationalizations of the Shapley value for attribution. We discuss a technique called Baseline Shapley (BShap), provide a proper uniqueness result for it, and contrast it with two other techniques from prior literature, Integrated Gradients \cite{STY17} and Conditional Expectation Shapley \cite{Lundberg2017AUA}.
APA
Sundararajan, M. & Najmi, A. (2020). The Many Shapley Values for Model Explanation. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:9269-9278. Available from https://proceedings.mlr.press/v119/sundararajan20b.html.