Influence Decompositions For Neural Network Attribution

Kyle Reing; Greg Ver Steeg; Aram Galstyan

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:2710-2718, 2021.

Abstract

Methods of neural network attribution have emerged out of a necessity for explanation and accountability in the predictions of black-box neural models. Most approaches use a variation of sensitivity analysis, where individual input variables are perturbed and the downstream effects on some output metric are measured. We demonstrate that a number of critical functional properties are not revealed when only considering lower-order perturbations. Motivated by these shortcomings, we propose a general framework for decomposing the orders of influence that a collection of input variables has on an output classification. These orders are based on the cardinality of input subsets which are perturbed to yield a change in classification. This decomposition can be naturally applied to attribute which input variables rely on higher-order coordination to impact the classification decision. We demonstrate that our approach correctly identifies higher-order attribution on a number of synthetic examples. Additionally, we showcase the differences between attribution in our approach and existing approaches on benchmark networks for MNIST and ImageNet.
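The idea of ordering influence by the cardinality of the perturbed input subset can be illustrated with a small sketch. This is not the authors' decomposition, only a toy illustration under stated assumptions: inputs are binary, a "perturbation" is a bit flip, and the hypothetical helpers `flip` and `minimal_flipping_subsets` are names introduced here. It groups the minimal subsets whose joint perturbation changes the classification by their size.

```python
from itertools import combinations


def flip(x, subset):
    """Return a copy of the binary tuple x with the bits at the given indices flipped."""
    return tuple(1 - v if i in subset else v for i, v in enumerate(x))


def minimal_flipping_subsets(classify, x):
    """Group minimal input subsets whose perturbation changes classify(x), keyed by size.

    Assumption (illustrative only): a subset has order-k influence if flipping
    its k inputs together changes the class and no smaller subset of it does.
    """
    base = classify(x)
    found = []      # minimal flipping subsets of strictly smaller size
    by_order = {}
    for k in range(1, len(x) + 1):
        for subset in combinations(range(len(x)), k):
            s = set(subset)
            # Skip supersets of a smaller subset that already changes the class.
            if any(m <= s for m in found):
                continue
            if classify(flip(x, s)) != base:
                by_order.setdefault(k, []).append(subset)
        found.extend(set(t) for t in by_order.get(k, []))
    return by_order


# Toy classifier: AND of the first two inputs; the third input is irrelevant.
and_gate = lambda x: x[0] & x[1]
print(minimal_flipping_subsets(and_gate, (0, 0, 0)))  # {2: [(0, 1)]}
```

At the input `(0, 0, 0)`, no single-input flip changes the output of the AND classifier, so a purely first-order sensitivity analysis would assign no influence to any input, while the pair `(0, 1)` shows up at order 2. The exhaustive enumeration is exponential in the number of inputs and is only meant for toy examples.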

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-reing21a,
  title = 	 {Influence Decompositions For Neural Network Attribution},
  author =       {Reing, Kyle and Ver Steeg, Greg and Galstyan, Aram},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {2710--2718},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/reing21a/reing21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/reing21a.html},
  abstract = 	 { Methods of neural network attribution have emerged out of a necessity for explanation and accountability in the predictions of black-box neural models. Most approaches use a variation of sensitivity analysis, where individual input variables are perturbed and the downstream effects on some output metric are measured. We demonstrate that a number of critical functional properties are not revealed when only considering lower-order perturbations. Motivated by these shortcomings, we propose a general framework for decomposing the orders of influence that a collection of input variables has on an output classification. These orders are based on the cardinality of input subsets which are perturbed to yield a change in classification. This decomposition can be naturally applied to attribute which input variables rely on higher-order coordination to impact the classification decision. We demonstrate that our approach correctly identifies higher-order attribution on a number of synthetic examples. Additionally, we showcase the differences between attribution in our approach and existing approaches on benchmark networks for MNIST and ImageNet. }
}

Endnote

%0 Conference Paper
%T Influence Decompositions For Neural Network Attribution
%A Kyle Reing
%A Greg Ver Steeg
%A Aram Galstyan
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-reing21a
%I PMLR
%P 2710--2718
%U https://proceedings.mlr.press/v130/reing21a.html
%V 130
%X  Methods of neural network attribution have emerged out of a necessity for explanation and accountability in the predictions of black-box neural models. Most approaches use a variation of sensitivity analysis, where individual input variables are perturbed and the downstream effects on some output metric are measured. We demonstrate that a number of critical functional properties are not revealed when only considering lower-order perturbations. Motivated by these shortcomings, we propose a general framework for decomposing the orders of influence that a collection of input variables has on an output classification. These orders are based on the cardinality of input subsets which are perturbed to yield a change in classification. This decomposition can be naturally applied to attribute which input variables rely on higher-order coordination to impact the classification decision. We demonstrate that our approach correctly identifies higher-order attribution on a number of synthetic examples. Additionally, we showcase the differences between attribution in our approach and existing approaches on benchmark networks for MNIST and ImageNet.

APA


Reing, K., Ver Steeg, G. & Galstyan, A. (2021). Influence Decompositions For Neural Network Attribution. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:2710-2718. Available from https://proceedings.mlr.press/v130/reing21a.html.
