Input Switched Affine Networks: An RNN Architecture Designed for Interpretability
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1136-1145, 2017.
Abstract
There exist many problem domains where the interpretability of neural network models is essential for deployment. Here we introduce a recurrent architecture composed of input-switched affine transformations, that is, an RNN without any explicit nonlinearities but with input-dependent recurrent weights. This simple form allows the RNN to be analyzed via straightforward linear methods: we can exactly characterize the linear contribution of each input to the model predictions; we can use a change-of-basis to disentangle input, output, and computational hidden unit subspaces; we can fully reverse-engineer the architecture's solution to a simple task. Despite this ease of interpretation, the input switched affine network achieves reasonable performance on a text modeling task, and allows greater computational efficiency than networks with standard nonlinearities.
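As a rough illustration of what the abstract describes, the sketch below assumes the hidden-state update takes the form h_t = W_{x_t} h_{t-1} + b_{x_t}, where the current input symbol x_t selects which weight matrix and bias to apply. Because no nonlinearity is applied, the final state splits exactly into one additive term per input. The parameter values, the two-symbol alphabet, and all function names here are hypothetical and chosen only for the example; the output readout layer is omitted.

```python
# Toy input-switched affine recurrence in pure Python (no external libraries).
# All names and numbers below are illustrative assumptions, not the paper's code.

def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def add(u, v):
    """Elementwise sum of two vectors."""
    return [a + b for a, b in zip(u, v)]

def run_isan(params, h0, symbols):
    """Apply h <- W_s h + b_s for each input symbol s, with no nonlinearity."""
    h = list(h0)
    for s in symbols:
        W, b = params[s]
        h = add(matvec(W, h), b)
    return h

def contributions(params, h0, symbols):
    """Split the final hidden state into additive terms.

    terms[0] is the propagated initial state; terms[i + 1] is the bias
    injected by symbols[i], propagated through all later transition matrices.
    """
    terms = [list(h0)]
    for s in symbols:
        W, b = params[s]
        terms = [matvec(W, t) for t in terms]  # push earlier terms forward
        terms.append(list(b))                  # new term from this input
    return terms

# Hypothetical parameters: one (W, b) pair per input symbol.
PARAMS = {
    "a": ([[0.5, 0.0], [0.0, 0.5]], [1.0, 0.0]),
    "b": ([[0.0, 1.0], [1.0, 0.0]], [0.0, 2.0]),
}
H0 = [0.0, 0.0]
SEQ = ["a", "b", "a"]

final = run_isan(PARAMS, H0, SEQ)
parts = contributions(PARAMS, H0, SEQ)
total = [sum(col) for col in zip(*parts)]  # summing the terms recovers `final`
```

Here `parts` holds one vector per input plus one for the initial state, and their sum equals the state produced by the ordinary forward pass. This is the sense in which each input's contribution can be characterized exactly under the assumed update rule.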