Estimating Information Flow in Deep Neural Networks

Ziv Goldfeld, Ewout Van Den Berg, Kristjan Greenewald, Igor Melnyk, Nam Nguyen, Brian Kingsbury, Yury Polyanskiy
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:2299-2308, 2019.

Abstract

We study the estimation of the mutual information $I(X;T_\ell)$ between the input $X$ to a deep neural network (DNN) and the output vector $T_\ell$ of its $\ell$-th hidden layer (an “internal representation”). Focusing on feedforward networks with fixed weights and noisy internal representations, we develop a rigorous framework for accurate estimation of $I(X;T_\ell)$. By relating $I(X;T_\ell)$ to information transmission over additive white Gaussian noise channels, we reveal that compression, i.e., reduction in $I(X;T_\ell)$ over the course of training, is driven by progressive geometric clustering of the representations of samples from the same class. Experimental results verify this connection. Finally, we shift focus to purely deterministic DNNs, where $I(X;T_\ell)$ is provably vacuous, and show that nevertheless, these models also cluster inputs belonging to the same class. The binning-based approximation of $I(X;T_\ell)$ employed in past works to measure compression is identified as a measure of clustering, thus clarifying that these experiments were in fact tracking the same clustering phenomenon. Leveraging the clustering perspective, we provide new evidence that compression and generalization may not be causally related and discuss potential future research ideas.
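For illustration only (this is not the authors' estimator), the following minimal NumPy sketch shows the binning-based approximation of $I(X;T_\ell)$ referred to above, under the common assumption that $X$ is uniform over the $n$ training samples and the layer is deterministic: then $H(\mathrm{Bin}(T_\ell)\mid X)=0$, so the binned estimate reduces to the entropy of the discretized representations, which is why it acts as a measure of clustering rather than of true mutual information. The function name, the uniform bin count, and the use of the empirical activation range are illustrative assumptions.

import numpy as np

def binned_mutual_information(hidden, num_bins=30):
    """Binning-based approximation of I(X; T_ell) for a deterministic layer.

    `hidden` is an (n_samples, width) array of hidden activations T_ell
    computed over the dataset, with X taken uniform over the n samples.
    Since T_ell is a deterministic function of X, H(Bin(T_ell) | X) = 0,
    and the estimate reduces to the entropy of the discretized
    representations, i.e. a measure of how tightly they cluster.
    """
    lo, hi = hidden.min(), hidden.max()
    edges = np.linspace(lo, hi, num_bins + 1)
    # Map each activation to a bin index; each row then becomes one bin "cell".
    digitized = np.digitize(hidden, edges[1:-1])
    _, counts = np.unique(digitized, axis=0, return_counts=True)
    probs = counts / counts.sum()
    # Entropy (in bits) of the binned representations = binned estimate of I(X; T_ell).
    return -np.sum(probs * np.log2(probs))

Tracking this quantity per layer across training epochs reproduces the kind of “compression” curves reported in earlier information-plane experiments, which, per the abstract, were in fact tracking the geometric clustering of same-class representations.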

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-goldfeld19a,
  title     = {Estimating Information Flow in Deep Neural Networks},
  author    = {Goldfeld, Ziv and Van Den Berg, Ewout and Greenewald, Kristjan and Melnyk, Igor and Nguyen, Nam and Kingsbury, Brian and Polyanskiy, Yury},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages     = {2299--2308},
  year      = {2019},
  editor    = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume    = {97},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--15 Jun},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v97/goldfeld19a/goldfeld19a.pdf},
  url       = {https://proceedings.mlr.press/v97/goldfeld19a.html}
}
Endnote
%0 Conference Paper
%T Estimating Information Flow in Deep Neural Networks
%A Ziv Goldfeld
%A Ewout Van Den Berg
%A Kristjan Greenewald
%A Igor Melnyk
%A Nam Nguyen
%A Brian Kingsbury
%A Yury Polyanskiy
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-goldfeld19a
%I PMLR
%P 2299--2308
%U https://proceedings.mlr.press/v97/goldfeld19a.html
%V 97
APA
Goldfeld, Z., Van Den Berg, E., Greenewald, K., Melnyk, I., Nguyen, N., Kingsbury, B. & Polyanskiy, Y. (2019). Estimating Information Flow in Deep Neural Networks. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:2299-2308. Available from https://proceedings.mlr.press/v97/goldfeld19a.html.