Leveraging Sparse Linear Layers for Debuggable Deep Networks

Eric Wong; Shibani Santurkar; Aleksander Madry

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:11205-11216, 2021.

Abstract

We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively and via human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-wong21b,
  title     = {Leveraging Sparse Linear Layers for Debuggable Deep Networks},
  author    = {Wong, Eric and Santurkar, Shibani and Madry, Aleksander},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {11205--11216},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/wong21b/wong21b.pdf},
  url       = {https://proceedings.mlr.press/v139/wong21b.html},
  abstract  = {We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively and via human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.}
}

Endnote

%0 Conference Paper
%T Leveraging Sparse Linear Layers for Debuggable Deep Networks
%A Eric Wong
%A Shibani Santurkar
%A Aleksander Madry
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-wong21b
%I PMLR
%P 11205--11216
%U https://proceedings.mlr.press/v139/wong21b.html
%V 139
%X We show how fitting sparse linear models over learned deep feature representations can lead to more debuggable neural networks. These networks remain highly accurate while also being more amenable to human interpretation, as we demonstrate quantitatively and via human experiments. We further illustrate how the resulting sparse explanations can help to identify spurious correlations, explain misclassifications, and diagnose model biases in vision and language tasks.

APA

Wong, E., Santurkar, S. & Madry, A. (2021). Leveraging Sparse Linear Layers for Debuggable Deep Networks. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:11205-11216. Available from https://proceedings.mlr.press/v139/wong21b.html.
