Input Switched Affine Networks: An RNN Architecture Designed for Interpretability

Jakob N. Foerster; Justin Gilmer; Jascha Sohl-Dickstein; Jan Chorowski; David Sussillo

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1136-1145, 2017.

Abstract

There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations – in other words an RNN without any explicit nonlinearities, but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture’s solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task, and allows greater computational efficiency than networks with standard nonlinearities.
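
As a rough illustration of the idea in the abstract, the sketch below assumes the hidden-state update takes the form h_t = W[x_t] h_{t-1} + b[x_t], with one matrix and one bias per input symbol and no nonlinearity. The names `isan_forward`, `input_contributions`, `W`, `b`, and `h0` are hypothetical, the readout layer is omitted, and plain Python lists are used only to keep the example self-contained; it is not the authors' implementation. Because every step is affine, the final hidden state splits exactly into one additive term per input.

```python
import random


def matvec(W, h):
    """Multiply matrix W (list of rows) by vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]


def isan_forward(tokens, W, b, h0):
    """Apply h_t = W[x_t] h_{t-1} + b[x_t] for each token x_t (assumed form)."""
    h = h0
    for x in tokens:
        h = [a + c for a, c in zip(matvec(W[x], h), b[x])]
    return h


def input_contributions(tokens, W, b, h0):
    """Split the final hidden state into additive per-input terms.

    The term for position s is b[x_s] pushed through the matrices of all
    later inputs. The extra leading term is h0 pushed through every matrix.
    """
    n = len(h0)
    prod = [[float(i == j) for j in range(n)] for i in range(n)]  # identity
    terms = []
    for x in reversed(tokens):
        terms.append(matvec(prod, b[x]))  # contribution of this input
        prod = matmul(prod, W[x])         # extend product of later matrices
    terms.append(matvec(prod, h0))        # contribution of the initial state
    terms.reverse()
    return terms  # terms[0] is the h0 term; terms[s + 1] belongs to tokens[s]


# Small random model: 3 input symbols, 4 hidden units (illustrative sizes).
rng = random.Random(0)
vocab, n = 3, 4
W = [[[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(n)]
     for _ in range(vocab)]
b = [[rng.uniform(-0.5, 0.5) for _ in range(n)] for _ in range(vocab)]
h0 = [rng.uniform(-0.5, 0.5) for _ in range(n)]
tokens = [0, 2, 1, 1, 0]

h_final = isan_forward(tokens, W, b, h0)
terms = input_contributions(tokens, W, b, h0)
recombined = [sum(col) for col in zip(*terms)]

# The per-input terms add back up to the final hidden state.
assert all(abs(a - c) < 1e-9 for a, c in zip(h_final, recombined))
```

Summing the returned terms reproduces the output of `isan_forward` up to floating-point error, which is the sense in which each input's contribution is linear and exactly attributable under this assumed update rule.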

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-foerster17a,
  title = 	 {Input Switched Affine Networks: An {RNN} Architecture Designed for Interpretability},
  author =       {Jakob N. Foerster and Justin Gilmer and Jascha Sohl-Dickstein and Jan Chorowski and David Sussillo},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {1136--1145},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/foerster17a/foerster17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/foerster17a.html},
  abstract = 	 {There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations – in other words an RNN without any explicit nonlinearities, but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture’s solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task, and allows greater computational efficiency than networks with standard nonlinearities.}
}

Endnote

%0 Conference Paper
%T Input Switched Affine Networks: An RNN Architecture Designed for Interpretability
%A Jakob N. Foerster
%A Justin Gilmer
%A Jascha Sohl-Dickstein
%A Jan Chorowski
%A David Sussillo
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-foerster17a
%I PMLR
%P 1136--1145
%U https://proceedings.mlr.press/v70/foerster17a.html
%V 70
%X There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations – in other words an RNN without any explicit nonlinearities, but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture’s solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task, and allows greater computational efficiency than networks with standard nonlinearities.

APA


Foerster, J.N., Gilmer, J., Sohl-Dickstein, J., Chorowski, J. & Sussillo, D. (2017). Input Switched Affine Networks: An RNN Architecture Designed for Interpretability. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1136-1145. Available from https://proceedings.mlr.press/v70/foerster17a.html.
