Input Switched Affine Networks: An RNN Architecture Designed for Interpretability

Jakob N. Foerster, Justin Gilmer, Jascha Sohl-Dickstein, Jan Chorowski, David Sussillo
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1136-1145, 2017.

Abstract

There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations – in other words an RNN without any explicit nonlinearities, but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture’s solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task and allows greater computational efficiency than networks with standard nonlinearities.
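
The core of the architecture is a recurrence that is affine in the hidden state, with the weight matrix and bias selected by the current input symbol. Below is a minimal NumPy sketch of that update and of the exact per-input decomposition the abstract refers to; all names, dimensions, and the random initialization here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, hidden_size = 5, 8

# One recurrent weight matrix and bias per input symbol: the "switch".
# (Illustrative random values; in the paper these are learned.)
W = rng.normal(scale=0.1, size=(vocab_size, hidden_size, hidden_size))
b = rng.normal(scale=0.1, size=(vocab_size, hidden_size))

# Linear readout from the hidden state to logits over the vocabulary.
W_out = rng.normal(scale=0.1, size=(vocab_size, hidden_size))
b_out = np.zeros(vocab_size)

def isan_forward(tokens):
    """h_t = W[x_t] @ h_{t-1} + b[x_t] -- no explicit nonlinearity."""
    h = np.zeros(hidden_size)
    for x in tokens:
        h = W[x] @ h + b[x]
    return W_out @ h + b_out, h

def input_contributions(tokens):
    """Exact additive decomposition of h_T into one term per input:
    h_T = sum_s (W[x_T] ... W[x_{s+1}]) @ b[x_s]   (with h_0 = 0)."""
    T = len(tokens)
    # suffix[j] = W[x_{T-1}] @ ... @ W[x_j]; suffix[T] = identity.
    suffix = [np.eye(hidden_size)]
    for x in reversed(tokens):
        suffix.append(suffix[-1] @ W[x])
    suffix.reverse()
    return [suffix[s + 1] @ b[tokens[s]] for s in range(T)]

tokens = [3, 1, 4, 1]
logits, h_T = isan_forward(tokens)
contribs = input_contributions(tokens)

# Because every step is affine, the decomposition is exact: the
# per-input contributions sum back to h_T, and each input's linear
# effect on the predictions is simply W_out @ contribs[s].
assert np.allclose(sum(contribs), h_T)
```

Since no nonlinearity ever mixes terms across time steps, the final assertion holds exactly, which is what makes the linear analyses described in the abstract possible.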

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-foerster17a,
  title     = {Input Switched Affine Networks: An {RNN} Architecture Designed for Interpretability},
  author    = {Jakob N. Foerster and Justin Gilmer and Jascha Sohl-Dickstein and Jan Chorowski and David Sussillo},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {1136--1145},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/foerster17a/foerster17a.pdf},
  url       = {https://proceedings.mlr.press/v70/foerster17a.html},
  abstract  = {There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations – in other words an RNN without any explicit nonlinearities, but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture’s solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task and allows greater computational efficiency than networks with standard nonlinearities.}
}
Endnote
%0 Conference Paper
%T Input Switched Affine Networks: An RNN Architecture Designed for Interpretability
%A Jakob N. Foerster
%A Justin Gilmer
%A Jascha Sohl-Dickstein
%A Jan Chorowski
%A David Sussillo
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-foerster17a
%I PMLR
%P 1136--1145
%U https://proceedings.mlr.press/v70/foerster17a.html
%V 70
%X There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations – in other words an RNN without any explicit nonlinearities, but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture’s solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task and allows greater computational efficiency than networks with standard nonlinearities.
APA
Foerster, J.N., Gilmer, J., Sohl-Dickstein, J., Chorowski, J. & Sussillo, D. (2017). Input Switched Affine Networks: An RNN Architecture Designed for Interpretability. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1136-1145. Available from https://proceedings.mlr.press/v70/foerster17a.html.