Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Alon Brutzkus; Amir Globerson

Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Alon Brutzkus, Amir Globerson

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:605-614, 2017.

Abstract

Deep learning models are often successfully trained using gradient descent, despite the worst case hardness of the underlying non-convex optimization problem. The key question is then under what conditions can one prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer and a convolutional structure with no overlap and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-brutzkus17a,
  title = 	 {Globally Optimal Gradient Descent for a {C}onv{N}et with {G}aussian Inputs},
  author =       {Alon Brutzkus and Amir Globerson},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {605--614},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/brutzkus17a/brutzkus17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/brutzkus17a.html},
  abstract = 	 {Deep learning models are often successfully trained using gradient descent, despite the worst case hardness of the underlying non-convex optimization problem. The key question is then under what conditions can one prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer and a convolutional structure with no overlap and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.}
}

Endnote

%0 Conference Paper
%T Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs
%A Alon Brutzkus
%A Amir Globerson
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh	
%F pmlr-v70-brutzkus17a
%I PMLR
%P 605--614
%U https://proceedings.mlr.press/v70/brutzkus17a.html
%V 70
%X Deep learning models are often successfully trained using gradient descent, despite the worst case hardness of the underlying non-convex optimization problem. The key question is then under what conditions can one prove that optimization will succeed. Here we provide a strong result of this kind. We consider a neural net with one hidden layer and a convolutional structure with no overlap and a ReLU activation function. For this architecture we show that learning is NP-complete in the general case, but that when the input distribution is Gaussian, gradient descent converges to the global optimum in polynomial time. To the best of our knowledge, this is the first global optimality guarantee of gradient descent on a convolutional neural network with ReLU activations.

APA


Brutzkus, A. & Globerson, A.. (2017). Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:605-614 Available from https://proceedings.mlr.press/v70/brutzkus17a.html.

Globally Optimal Gradient Descent for a ConvNet with Gaussian Inputs

Abstract

Cite this Paper

Related Material