Understanding the Representation and Computation of Multilayer Perceptrons: A Case Study in Speech Recognition


Tasha Nagamine, Nima Mesgarani
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2564-2573, 2017.


Despite the recent success of deep learning, the nature of the transformations that deep networks apply to their input features remains poorly understood. This study provides an empirical framework for studying the encoding properties of node activations in various layers of the network, and for constructing the exact function applied to each data point in the form of a linear transform. These methods are used to discern and quantify properties of feed-forward neural networks trained to map acoustic features to phoneme labels. We show a selective and nonlinear warping of the feature space, achieved by forming prototypical functions to account for the possible variation of each class. The result is a joint framework in which the properties of node activations and the functions implemented by the network can be linked together.
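To make the idea of "the exact function applied to each data point in the form of a linear transform" concrete, the following is a minimal sketch, not the authors' implementation. It assumes a feed-forward network with ReLU hidden units and a linear output layer (the abstract does not state the activation function), and it uses small random weights rather than a trained speech model. Under that assumption, each input switches a fixed set of hidden units on or off, so the network output at that input equals an affine map A x + c. The names `relu_mlp_forward` and `local_affine_map` and the layer sizes are hypothetical, chosen only for illustration.

```python
import numpy as np


def relu_mlp_forward(x, weights, biases):
    """Ordinary forward pass: ReLU hidden layers, linear output layer."""
    h = x
    for W, b in zip(weights[:-1], biases[:-1]):
        h = np.maximum(0.0, W @ h + b)
    return weights[-1] @ h + biases[-1]


def local_affine_map(x, weights, biases):
    """Return (A, c) such that the network output at this x is A @ x + c.

    Assumes ReLU hidden units: the on/off pattern of the units at x
    turns each hidden layer into a masked linear map, and the masked
    layers compose into a single affine function of the input.
    """
    A = np.eye(x.shape[0])
    c = np.zeros(x.shape[0])
    for W, b in zip(weights[:-1], biases[:-1]):
        pre = W @ (A @ x + c) + b          # pre-activation at this layer
        mask = (pre > 0).astype(float)     # which units are active for x
        A = mask[:, None] * (W @ A)        # zero the rows of inactive units
        c = mask * (W @ c + b)
    # Output layer is linear, so it composes without a mask.
    A = weights[-1] @ A
    c = weights[-1] @ c + biases[-1]
    return A, c


# Toy network with random weights (assumed sizes, not from the paper).
rng = np.random.default_rng(0)
sizes = [4, 8, 8, 3]
weights = [rng.standard_normal((n_out, n_in))
           for n_in, n_out in zip(sizes[:-1], sizes[1:])]
biases = [rng.standard_normal(n_out) for n_out in sizes[1:]]

x = rng.standard_normal(sizes[0])
A, c = local_affine_map(x, weights, biases)

# The affine map reproduces the network output exactly at this input.
assert np.allclose(A @ x + c, relu_mlp_forward(x, weights, biases))
```

Inputs that activate the same set of hidden units share the same (A, c), while inputs with different activation patterns generally get different maps. This is one way to read the abstract's link between node activations and the function the network applies.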
