Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation

Sahil Singla, Eric Wallace, Shi Feng, Soheil Feizi
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:5848-5856, 2019.

Abstract

Current saliency map interpretations for neural networks generally rely on two key assumptions. First, they use first-order approximations of the loss function, neglecting higher-order terms such as the loss curvature. Second, they evaluate each feature’s importance in isolation, ignoring feature interdependencies. This work studies the effect of relaxing these two assumptions. First, we characterize a closed-form formula for the input Hessian matrix of a deep ReLU network. Using this formula, we show that, for classification problems with many classes, if a prediction has high probability then including the Hessian term has a small impact on the interpretation. We prove this result by demonstrating that these conditions cause the Hessian matrix to be approximately rank one and its leading eigenvector to be almost parallel to the gradient of the loss. We empirically validate this theory by interpreting ImageNet classifiers. Second, we incorporate feature interdependencies by calculating the importance of group-features using a sparsity regularization term. We use an L0 - L1 relaxation technique along with proximal gradient descent to efficiently compute group-feature importance values. Our empirical results show that our method significantly improves deep learning interpretations.
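To make the central theoretical claim concrete, the following is a minimal PyTorch sketch (not the authors' code) that checks it numerically on a toy model: for a confident prediction of a ReLU classifier under cross-entropy loss, the input Hessian should be approximately rank one and its top eigenvector nearly parallel to the input gradient. The toy architecture, dimensions, target class, and the single-example overfitting step are illustrative assumptions, not details from the paper.

# Sketch under the assumptions stated above.
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

d, k = 20, 10                                  # input dimension, number of classes
net = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, k))
x = torch.randn(d)
y = torch.tensor([3])                          # arbitrary target class

def loss_fn(inp):
    # Cross-entropy loss of the network as a function of the input alone.
    return F.cross_entropy(net(inp).unsqueeze(0), y)

# Overfit on this single example so the predicted probability of y is high,
# which is the high-confidence regime covered by the theory.
opt = torch.optim.SGD(net.parameters(), lr=0.1)
for _ in range(200):
    opt.zero_grad()
    loss_fn(x).backward()
    opt.step()

# First-order term: the input gradient used by standard saliency maps.
x_req = x.clone().requires_grad_(True)
grad = torch.autograd.grad(loss_fn(x_req), x_req)[0]

# Second-order term: the full input Hessian studied in the paper.
hess = torch.autograd.functional.hessian(loss_fn, x)

# Eigendecomposition: one eigenvalue should dominate (approximate rank one),
# and the corresponding eigenvector should align with the gradient.
eigvals, eigvecs = torch.linalg.eigh(hess)
top_vec = eigvecs[:, -1]
cosine = F.cosine_similarity(grad, top_vec, dim=0).abs()

print("predicted prob of y:", F.softmax(net(x), dim=-1)[y.item()].item())
print("top two Hessian eigenvalues:", eigvals[-1].item(), eigvals[-2].item())
print("|cos(gradient, top eigenvector)|:", cosine.item())

In this setting a large gap between the top two eigenvalues and a cosine close to 1 would illustrate why adding the Hessian term changes the interpretation only marginally for high-confidence predictions.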

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-singla19a,
  title     = {Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation},
  author    = {Singla, Sahil and Wallace, Eric and Feng, Shi and Feizi, Soheil},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {5848--5856},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/singla19a/singla19a.pdf},
  url       = {https://proceedings.mlr.press/v97/singla19a.html},
  abstract  = {Current saliency map interpretations for neural networks generally rely on two key assumptions. First, they use first-order approximations of the loss function, neglecting higher-order terms such as the loss curvature. Second, they evaluate each feature’s importance in isolation, ignoring feature interdependencies. This work studies the effect of relaxing these two assumptions. First, we characterize a closed-form formula for the input Hessian matrix of a deep ReLU network. Using this formula, we show that, for classification problems with many classes, if a prediction has high probability then including the Hessian term has a small impact on the interpretation. We prove this result by demonstrating that these conditions cause the Hessian matrix to be approximately rank one and its leading eigenvector to be almost parallel to the gradient of the loss. We empirically validate this theory by interpreting ImageNet classifiers. Second, we incorporate feature interdependencies by calculating the importance of group-features using a sparsity regularization term. We use an L0 - L1 relaxation technique along with proximal gradient descent to efficiently compute group-feature importance values. Our empirical results show that our method significantly improves deep learning interpretations.}
}
Endnote
%0 Conference Paper
%T Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation
%A Sahil Singla
%A Eric Wallace
%A Shi Feng
%A Soheil Feizi
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-singla19a
%I PMLR
%P 5848--5856
%U https://proceedings.mlr.press/v97/singla19a.html
%V 97
%X Current saliency map interpretations for neural networks generally rely on two key assumptions. First, they use first-order approximations of the loss function, neglecting higher-order terms such as the loss curvature. Second, they evaluate each feature’s importance in isolation, ignoring feature interdependencies. This work studies the effect of relaxing these two assumptions. First, we characterize a closed-form formula for the input Hessian matrix of a deep ReLU network. Using this formula, we show that, for classification problems with many classes, if a prediction has high probability then including the Hessian term has a small impact on the interpretation. We prove this result by demonstrating that these conditions cause the Hessian matrix to be approximately rank one and its leading eigenvector to be almost parallel to the gradient of the loss. We empirically validate this theory by interpreting ImageNet classifiers. Second, we incorporate feature interdependencies by calculating the importance of group-features using a sparsity regularization term. We use an L0 - L1 relaxation technique along with proximal gradient descent to efficiently compute group-feature importance values. Our empirical results show that our method significantly improves deep learning interpretations.
APA
Singla, S., Wallace, E., Feng, S. & Feizi, S. (2019). Understanding Impacts of High-Order Loss Approximations and Features in Deep Learning Interpretation. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:5848-5856. Available from https://proceedings.mlr.press/v97/singla19a.html.