Feature Learning and Signal Propagation in Deep Neural Networks

Yizhang Lou, Chris E Mingard, Soufiane Hayou
Proceedings of the 39th International Conference on Machine Learning, PMLR 162:14248-14282, 2022.

Abstract

Recent work by Baratin et al. (2021) sheds light on an intriguing pattern that occurs during the training of deep neural networks: some layers align much more with data than other layers (where the alignment is defined as the normalized Euclidean product of the tangent features matrix and the data labels matrix). The curve of the alignment as a function of layer index (generally) exhibits an ascent-descent pattern where the maximum is reached for some hidden layer. In this work, we provide the first explanation for this phenomenon. We introduce the Equilibrium Hypothesis, which connects this alignment pattern to signal propagation in deep neural networks. Our experiments demonstrate an excellent match with the theoretical predictions.
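For concreteness, the alignment measure can be illustrated with a short sketch. The snippet below is a minimal NumPy illustration, assuming (as in Baratin et al., 2021) that a layer's alignment is the normalized Frobenius inner product between its tangent kernel Φ_l Φ_l^T and the label Gram matrix YY^T; the layer widths and random toy features here are hypothetical stand-ins for actual tangent features.

```python
import numpy as np

def layer_alignment(phi: np.ndarray, y: np.ndarray) -> float:
    """Normalized Frobenius inner product between a layer's tangent
    kernel K = phi @ phi.T and the label Gram matrix Y Y^T.

    phi : (n, p) tangent features of one layer, one row per example
    y   : (n, c) one-hot (or real-valued) label matrix
    """
    K = phi @ phi.T           # (n, n) tangent kernel for this layer
    KY = y @ y.T              # (n, n) label Gram matrix
    num = np.sum(K * KY)      # Frobenius inner product <K, Y Y^T>_F
    return num / (np.linalg.norm(K) * np.linalg.norm(KY))

# Toy example: random stand-in features for three "layers".
rng = np.random.default_rng(0)
n, c = 64, 10
y = np.eye(c)[rng.integers(0, c, size=n)]   # one-hot labels
for l, p in enumerate([128, 256, 512], start=1):
    phi = rng.standard_normal((n, p))       # hypothetical tangent features
    print(f"layer {l}: alignment = {layer_alignment(phi, y):.4f}")
```

With i.i.d. random features the alignment is small and roughly uniform across layers; the ascent-descent curve described in the abstract only appears when the tangent features of an actual trained network are plugged in.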

Cite this Paper


BibTeX
@InProceedings{pmlr-v162-lou22a,
  title     = {Feature Learning and Signal Propagation in Deep Neural Networks},
  author    = {Lou, Yizhang and Mingard, Chris E and Hayou, Soufiane},
  booktitle = {Proceedings of the 39th International Conference on Machine Learning},
  pages     = {14248--14282},
  year      = {2022},
  editor    = {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume    = {162},
  series    = {Proceedings of Machine Learning Research},
  month     = {17--23 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v162/lou22a/lou22a.pdf},
  url       = {https://proceedings.mlr.press/v162/lou22a.html},
  abstract  = {Recent work by Baratin et al. (2021) sheds light on an intriguing pattern that occurs during the training of deep neural networks: some layers align much more with data than other layers (where the alignment is defined as the normalized Euclidean product of the tangent features matrix and the data labels matrix). The curve of the alignment as a function of layer index (generally) exhibits an ascent-descent pattern where the maximum is reached for some hidden layer. In this work, we provide the first explanation for this phenomenon. We introduce the Equilibrium Hypothesis, which connects this alignment pattern to signal propagation in deep neural networks. Our experiments demonstrate an excellent match with the theoretical predictions.}
}
Endnote
%0 Conference Paper
%T Feature Learning and Signal Propagation in Deep Neural Networks
%A Yizhang Lou
%A Chris E Mingard
%A Soufiane Hayou
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato
%F pmlr-v162-lou22a
%I PMLR
%P 14248--14282
%U https://proceedings.mlr.press/v162/lou22a.html
%V 162
%X Recent work by Baratin et al. (2021) sheds light on an intriguing pattern that occurs during the training of deep neural networks: some layers align much more with data than other layers (where the alignment is defined as the normalized Euclidean product of the tangent features matrix and the data labels matrix). The curve of the alignment as a function of layer index (generally) exhibits an ascent-descent pattern where the maximum is reached for some hidden layer. In this work, we provide the first explanation for this phenomenon. We introduce the Equilibrium Hypothesis, which connects this alignment pattern to signal propagation in deep neural networks. Our experiments demonstrate an excellent match with the theoretical predictions.
APA
Lou, Y., Mingard, C.E. & Hayou, S. (2022). Feature Learning and Signal Propagation in Deep Neural Networks. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:14248-14282. Available from https://proceedings.mlr.press/v162/lou22a.html.
