The Multilinear Structure of ReLU Networks

Thomas Laurent, James von Brecht
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:2908-2916, 2018.

Abstract

We study the loss surface of neural networks equipped with a hinge loss criterion and ReLU or leaky ReLU nonlinearities. Any such network defines a piecewise multilinear form in parameter space. By appealing to harmonic analysis we show that all local minima of such networks are non-differentiable, except for those minima that occur in a region of parameter space where the loss surface is perfectly flat. Non-differentiable minima are therefore not technicalities or pathologies; they are at the heart of the problem when investigating the loss of ReLU networks. As a consequence, we must employ techniques from nonsmooth analysis to study these loss surfaces. We show how to apply these techniques in some illustrative cases.
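To make the abstract's central claim concrete, here is a standard sketch of why fixed activation patterns yield multilinearity (the notation below is illustrative and not taken from this page). On any region of parameter space where the sign of every pre-activation is fixed, each ReLU unit acts as multiplication by a constant, so a network with weight matrices $W^1, \dots, W^L$ reduces to

$$f(x; W^1, \dots, W^L) = W^L D^{L-1} W^{L-1} \cdots D^1 W^1 x,$$

where each $D^{\ell}$ is a fixed diagonal matrix with entries in $\{0, 1\}$ (or $\{\lambda, 1\}$ for leaky ReLU). This expression is linear in each $W^{\ell}$ taken separately, i.e. multilinear in the parameters, and composing it with the piecewise linear hinge loss $\max(0, 1 - y\hat{y})$ leaves the total loss piecewise multilinear. The non-differentiability claim is already visible in one dimension: a function that is linear on each piece of its domain can attain a local minimum only at a kink between pieces, unless it is constant near the minimizer.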

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-laurent18b,
  title = {The Multilinear Structure of {R}e{LU} Networks},
  author = {Laurent, Thomas and von Brecht, James},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages = {2908--2916},
  year = {2018},
  editor = {Dy, Jennifer and Krause, Andreas},
  volume = {80},
  series = {Proceedings of Machine Learning Research},
  month = {10--15 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v80/laurent18b/laurent18b.pdf},
  url = {https://proceedings.mlr.press/v80/laurent18b.html},
  abstract = {We study the loss surface of neural networks equipped with a hinge loss criterion and ReLU or leaky ReLU nonlinearities. Any such network defines a piecewise multilinear form in parameter space. By appealing to harmonic analysis we show that all local minima of such networks are non-differentiable, except for those minima that occur in a region of parameter space where the loss surface is perfectly flat. Non-differentiable minima are therefore not technicalities or pathologies; they are at the heart of the problem when investigating the loss of ReLU networks. As a consequence, we must employ techniques from nonsmooth analysis to study these loss surfaces. We show how to apply these techniques in some illustrative cases.}
}
Endnote
%0 Conference Paper
%T The Multilinear Structure of ReLU Networks
%A Thomas Laurent
%A James von Brecht
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-laurent18b
%I PMLR
%P 2908--2916
%U https://proceedings.mlr.press/v80/laurent18b.html
%V 80
%X We study the loss surface of neural networks equipped with a hinge loss criterion and ReLU or leaky ReLU nonlinearities. Any such network defines a piecewise multilinear form in parameter space. By appealing to harmonic analysis we show that all local minima of such networks are non-differentiable, except for those minima that occur in a region of parameter space where the loss surface is perfectly flat. Non-differentiable minima are therefore not technicalities or pathologies; they are at the heart of the problem when investigating the loss of ReLU networks. As a consequence, we must employ techniques from nonsmooth analysis to study these loss surfaces. We show how to apply these techniques in some illustrative cases.
APA
Laurent, T. & von Brecht, J. (2018). The Multilinear Structure of ReLU Networks. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:2908-2916. Available from https://proceedings.mlr.press/v80/laurent18b.html.
