Inverting Supervised Representations with Autoregressive Neural Density Models

Charlie Nash, Nate Kushman, Christopher K.I. Williams
Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, PMLR 89:1620-1629, 2019.

Abstract

We present a method for feature interpretation that makes use of recent advances in autoregressive density estimation models to invert model representations. We train generative inversion models to express a distribution over input features conditioned on intermediate model representations. Insights into the invariances learned by supervised models can be gained by viewing samples from these inversion models. In addition, we can use these inversion models to estimate the mutual information between a model’s inputs and its intermediate representations, thus quantifying the amount of information preserved by the network at different stages. Using this method we examine the types of information preserved at different layers of convolutional neural networks, and explore the invariances induced by different architectural choices. Finally we show that the mutual information between inputs and network layers initially increases and then decreases over the course of training, supporting recent work by Shwartz-Ziv and Tishby (2017) on the information bottleneck theory of deep learning.
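As a rough sketch of the mutual-information estimate described in the abstract (the notation here is ours, not necessarily the paper's): an inversion model q_\phi(x \mid h), fit by maximum likelihood to inputs x paired with intermediate representations h, has an expected negative log-likelihood that upper-bounds the conditional entropy of inputs given the representation,

    \mathbb{E}_{p(x,h)}\left[-\log q_\phi(x \mid h)\right] \;=\; H(X \mid H) \;+\; \mathbb{E}_{p(h)}\left[\mathrm{KL}\!\left(p(x \mid h)\,\|\,q_\phi(x \mid h)\right)\right] \;\ge\; H(X \mid H).

Combining this with an estimate of the marginal entropy H(X), for example from an unconditional autoregressive model over inputs, yields an estimate of the information preserved by the representation,

    I(X; H) \;=\; H(X) \;-\; H(X \mid H).

Since both entropies are approximated by learned density models, the result is a point estimate rather than a strict bound.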

Cite this Paper


BibTeX
@InProceedings{pmlr-v89-nash19a,
  title     = {Inverting Supervised Representations with Autoregressive Neural Density Models},
  author    = {Nash, Charlie and Kushman, Nate and Williams, Christopher K.I.},
  booktitle = {Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics},
  pages     = {1620--1629},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Sugiyama, Masashi},
  volume    = {89},
  series    = {Proceedings of Machine Learning Research},
  month     = {16--18 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v89/nash19a/nash19a.pdf},
  url       = {https://proceedings.mlr.press/v89/nash19a.html}
}
Endnote
%0 Conference Paper
%T Inverting Supervised Representations with Autoregressive Neural Density Models
%A Charlie Nash
%A Nate Kushman
%A Christopher K.I. Williams
%B Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Masashi Sugiyama
%F pmlr-v89-nash19a
%I PMLR
%P 1620--1629
%U https://proceedings.mlr.press/v89/nash19a.html
%V 89
APA
Nash, C., Kushman, N. & Williams, C.K.I. (2019). Inverting Supervised Representations with Autoregressive Neural Density Models. Proceedings of the Twenty-Second International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 89:1620-1629. Available from https://proceedings.mlr.press/v89/nash19a.html.