The Shapley Taylor Interaction Index

Mukund Sundararajan, Kedar Dhamdhere, Ashish Agarwal
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:9259-9268, 2020.

Abstract

The attribution problem, that is the problem of attributing a model’s prediction to its base features, is well-studied. We extend the notion of attribution to also apply to feature interactions. The Shapley value is a commonly used method to attribute a model’s prediction to its base features. We propose a generalization of the Shapley value called Shapley-Taylor index that attributes the model’s prediction to interactions of subsets of features up to some size $k$. The method is analogous to how the truncated Taylor Series decomposes the function value at a certain point using its derivatives at a different point. In fact, we show that the Shapley Taylor index is equal to the Taylor Series of the multilinear extension of the set-theoretic behavior of the model. We axiomatize this method using the standard Shapley axioms—linearity, dummy, symmetry and efficiency—and an additional axiom that we call the interaction distribution axiom. This new axiom explicitly characterizes how interactions are distributed for a class of functions that model pure interaction. We contrast the Shapley-Taylor index against the previously proposed Shapley Interaction index from the cooperative game theory literature. We also apply the Shapley Taylor index to three models and identify interesting qualitative insights.
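The abstract's formulas are concrete enough to sketch in code. Below is a brute-force Python illustration (not the authors' implementation) of the order-$k$ Shapley-Taylor index for a small set function, following the closed form from the paper: subsets of size below $k$ receive their discrete derivative at the empty set, and subsets of size exactly $k$ receive an average of discrete derivatives over background sets. The toy game and the function names are illustrative assumptions.

```python
from itertools import combinations
from math import comb

def discrete_derivative(f, S, T):
    """delta_S f(T) = sum over W subseteq S of (-1)^{|S|-|W|} f(T union W)."""
    S, T = frozenset(S), frozenset(T)
    total = 0.0
    for r in range(len(S) + 1):
        for W in combinations(sorted(S), r):
            total += (-1) ** (len(S) - r) * f(T | frozenset(W))
    return total

def shapley_taylor(f, n, k):
    """Brute-force Shapley-Taylor indices of order k for a set function f
    on players {0, ..., n-1}. Returns {frozenset S: index} for 1 <= |S| <= k.
    Exponential in n; intended only to illustrate the definition."""
    N = frozenset(range(n))
    idx = {}
    for size in range(1, k + 1):
        for S in combinations(range(n), size):
            Sf = frozenset(S)
            if size < k:
                # Subsets smaller than k: discrete derivative at the empty set.
                idx[Sf] = discrete_derivative(f, Sf, frozenset())
            else:
                # Subsets of size exactly k: (k/n) * sum over backgrounds T,
                # weighting each |T| level uniformly via 1 / C(n-1, |T|).
                rest = sorted(N - Sf)
                total = 0.0
                for r in range(len(rest) + 1):
                    for T in combinations(rest, r):
                        total += discrete_derivative(f, Sf, T) / comb(n - 1, r)
                idx[Sf] = (k / n) * total
    return idx

# Toy game on 3 players: each player contributes 1, and players 0 and 1
# share a pure pairwise interaction worth 2.
f = lambda S: len(S) + (2 if {0, 1} <= set(S) else 0)
indices = shapley_taylor(f, n=3, k=2)
# Each singleton gets 1; the pair {0, 1} gets the full interaction of 2;
# all indices together sum to f(N) - f(empty) = 5, matching efficiency.
```

On this toy game the index isolates the interaction exactly: the pairs {0, 2} and {1, 2}, which carry no interaction, receive 0.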

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-sundararajan20a,
  title     = {The Shapley Taylor Interaction Index},
  author    = {Sundararajan, Mukund and Dhamdhere, Kedar and Agarwal, Ashish},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {9259--9268},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/sundararajan20a/sundararajan20a.pdf},
  url       = {https://proceedings.mlr.press/v119/sundararajan20a.html}
}
Endnote
%0 Conference Paper
%T The Shapley Taylor Interaction Index
%A Mukund Sundararajan
%A Kedar Dhamdhere
%A Ashish Agarwal
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-sundararajan20a
%I PMLR
%P 9259--9268
%U https://proceedings.mlr.press/v119/sundararajan20a.html
%V 119
APA
Sundararajan, M., Dhamdhere, K. & Agarwal, A. (2020). The Shapley Taylor Interaction Index. Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:9259-9268. Available from https://proceedings.mlr.press/v119/sundararajan20a.html.