Open Problem: The landscape of the loss surfaces of multilayer networks

Anna Choromanska; Yann LeCun; Gérard Ben Arous

Proceedings of The 28th Conference on Learning Theory, PMLR 40:1756-1760, 2015.

Abstract

Deep learning has enjoyed a resurgence of interest in the last few years for applications such as image and speech recognition and natural language processing. The vast majority of practical applications of deep learning focus on supervised learning, where the supervised loss function is minimized using stochastic gradient descent. However, the properties of this highly non-convex loss function, such as its landscape and the behavior of its critical points (maxima, minima, and saddle points), as well as the reason why large and small networks achieve radically different practical performance, remain very poorly understood. It was only recently shown that new results in spin-glass theory may provide an explanation for these problems by establishing a connection between the loss function of neural networks and the Hamiltonian of spherical spin-glass models. The connection between the two models relies on a number of possibly unrealistic assumptions, yet empirical evidence suggests that the connection may hold in practice. The question we pose is whether it is possible to drop some of these assumptions to establish a stronger connection between the two models.

Cite this Paper

BibTeX


@InProceedings{pmlr-v40-Choromanska15,
  title     = {Open Problem: The landscape of the loss surfaces of multilayer networks},
  author    = {Choromanska, Anna and LeCun, Yann and Ben Arous, Gérard},
  booktitle = {Proceedings of The 28th Conference on Learning Theory},
  pages     = {1756--1760},
  year      = {2015},
  editor    = {Grünwald, Peter and Hazan, Elad and Kale, Satyen},
  volume    = {40},
  series    = {Proceedings of Machine Learning Research},
  address   = {Paris, France},
  month     = {03--06 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v40/Choromanska15.pdf},
  url       = {https://proceedings.mlr.press/v40/Choromanska15.html},
  abstract = 	 {Deep learning has enjoyed a resurgence of interest in the last few years for such applications as image and speech recognition, or natural language processing. The vast majority of practical applications of deep learning focus on supervised learning, where the supervised loss function is minimized using stochastic gradient descent. The properties of this highly non-convex loss function, such as its landscape and the behavior of critical points (maxima, minima, and saddle points), as well as the reason why large- and small-size networks achieve radically different practical performance, are however very poorly understood. It was only recently shown that new results in spin-glass theory potentially may provide an explanation for these problems by establishing a connection between the loss function of the neural networks and the Hamiltonian of the spherical spin-glass models. The connection between both models relies on a number of possibly unrealistic assumptions, yet the empirical evidence suggests that the connection may exist in real. The question we pose is whether it is possible to drop some of these assumptions to establish a stronger connection between both models.}
}

Endnote

%0 Conference Paper
%T Open Problem: The landscape of the loss surfaces of multilayer networks
%A Anna Choromanska
%A Yann LeCun
%A Gérard Ben Arous
%B Proceedings of The 28th Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2015
%E Peter Grünwald
%E Elad Hazan
%E Satyen Kale
%F pmlr-v40-Choromanska15
%I PMLR
%P 1756--1760
%U https://proceedings.mlr.press/v40/Choromanska15.html
%V 40
%X Deep learning has enjoyed a resurgence of interest in the last few years for such applications as image and speech recognition, or natural language processing. The vast majority of practical applications of deep learning focus on supervised learning, where the supervised loss function is minimized using stochastic gradient descent. The properties of this highly non-convex loss function, such as its landscape and the behavior of critical points (maxima, minima, and saddle points), as well as the reason why large- and small-size networks achieve radically different practical performance, are however very poorly understood. It was only recently shown that new results in spin-glass theory potentially may provide an explanation for these problems by establishing a connection between the loss function of the neural networks and the Hamiltonian of the spherical spin-glass models. The connection between both models relies on a number of possibly unrealistic assumptions, yet the empirical evidence suggests that the connection may exist in real. The question we pose is whether it is possible to drop some of these assumptions to establish a stronger connection between both models.

RIS


TY  - CPAPER
TI  - Open Problem: The landscape of the loss surfaces of multilayer networks
AU  - Anna Choromanska
AU  - Yann LeCun
AU  - Gérard Ben Arous
BT  - Proceedings of The 28th Conference on Learning Theory
DA  - 2015/06/26
ED  - Peter Grünwald
ED  - Elad Hazan
ED  - Satyen Kale
ID  - pmlr-v40-Choromanska15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 40
SP  - 1756
EP  - 1760
L1  - http://proceedings.mlr.press/v40/Choromanska15.pdf
UR  - https://proceedings.mlr.press/v40/Choromanska15.html
AB  - Deep learning has enjoyed a resurgence of interest in the last few years for such applications as image and speech recognition, or natural language processing. The vast majority of practical applications of deep learning focus on supervised learning, where the supervised loss function is minimized using stochastic gradient descent. The properties of this highly non-convex loss function, such as its landscape and the behavior of critical points (maxima, minima, and saddle points), as well as the reason why large- and small-size networks achieve radically different practical performance, are however very poorly understood. It was only recently shown that new results in spin-glass theory potentially may provide an explanation for these problems by establishing a connection between the loss function of the neural networks and the Hamiltonian of the spherical spin-glass models. The connection between both models relies on a number of possibly unrealistic assumptions, yet the empirical evidence suggests that the connection may exist in real. The question we pose is whether it is possible to drop some of these assumptions to establish a stronger connection between both models.
ER  -

APA


Choromanska, A., LeCun, Y. & Ben Arous, G. (2015). Open Problem: The landscape of the loss surfaces of multilayer networks. Proceedings of The 28th Conference on Learning Theory, in Proceedings of Machine Learning Research 40:1756-1760. Available from https://proceedings.mlr.press/v40/Choromanska15.html.
