Learning One Convolutional Layer with Overlapping Patches

Surbhi Goel; Adam Klivans; Raghu Meka

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1783-1791, 2018.

Abstract

We give the first provably efficient algorithm for learning a one-hidden-layer convolutional network with respect to a general class of (potentially overlapping) patches under mild conditions on the underlying distribution. We prove that our framework captures commonly used schemes from computer vision, including one-dimensional and two-dimensional “patch and stride” convolutions. Our algorithm, Convotron, is inspired by recent work applying isotonic regression to learning neural networks. Convotron uses a simple, iterative update rule that is stochastic in nature and tolerant to noise (it requires only that the conditional mean function is a one-layer convolutional network, as opposed to the realizable setting). In contrast to gradient descent, Convotron requires no special initialization or learning-rate tuning to converge to the global optimum. We also point out that learning one hidden convolutional layer with respect to a Gaussian distribution and just one disjoint patch $P$ (the other patches may be arbitrary) is easy in the following sense: Convotron can efficiently recover the hidden weight vector by updating only in the direction of $P$.
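
The abstract describes a simple, stochastic, iterative update over (possibly overlapping) patches. The following is a minimal sketch of an update of that kind, not the authors' reference implementation. It assumes a ReLU activation, average pooling over patches, a one-dimensional "patch and stride" layout, and a fixed step size `eta`; the function names `patches_1d`, `predict`, and `convotron_sketch` are hypothetical and do not come from this page. Consult the paper for the exact algorithm and its conditions.

```python
import numpy as np


def patches_1d(x, r, stride):
    """Stack the length-r windows of x taken every `stride` entries.

    Windows overlap whenever stride < r.
    """
    starts = range(0, len(x) - r + 1, stride)
    return np.stack([x[s:s + r] for s in starts])


def predict(w, P):
    """Assumed network: average of ReLU(w . patch) over the rows of P."""
    return np.maximum(P @ w, 0.0).mean()


def convotron_sketch(X, y, r, stride, eta, w0=None):
    """One pass of a Convotron-style update over the samples (X[t], y[t]).

    Each step moves w along the sum of the sample's patches, scaled by the
    prediction error. The step size eta is an assumption of this sketch.
    """
    w = np.zeros(r) if w0 is None else np.array(w0, dtype=float)
    for x_t, y_t in zip(X, y):
        P = patches_1d(x_t, r, stride)
        w = w + eta * (y_t - predict(w, P)) * P.sum(axis=0)
    return w
```

For example, `convotron_sketch(X, y, r=3, stride=1, eta=0.1)` starts from the zero vector and processes each row of `X` once. The update does not differentiate through the activation, which is the sense in which it differs from a plain gradient step.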

Cite this Paper

BibTeX

@InProceedings{pmlr-v80-goel18a,
  title = 	 {Learning One Convolutional Layer with Overlapping Patches},
  author =       {Goel, Surbhi and Klivans, Adam and Meka, Raghu},
  booktitle = 	 {Proceedings of the 35th International Conference on Machine Learning},
  pages = 	 {1783--1791},
  year = 	 {2018},
  editor = 	 {Dy, Jennifer and Krause, Andreas},
  volume = 	 {80},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v80/goel18a/goel18a.pdf},
  url = 	 {https://proceedings.mlr.press/v80/goel18a.html},
  abstract = 	 {We give the first provably efficient algorithm for learning a one-hidden-layer convolutional network with respect to a general class of (potentially overlapping) patches under mild conditions on the underlying distribution. We prove that our framework captures commonly used schemes from computer vision, including one-dimensional and two-dimensional “patch and stride” convolutions. Our algorithm, Convotron, is inspired by recent work applying isotonic regression to learning neural networks. Convotron uses a simple, iterative update rule that is stochastic in nature and tolerant to noise (it requires only that the conditional mean function is a one-layer convolutional network, as opposed to the realizable setting). In contrast to gradient descent, Convotron requires no special initialization or learning-rate tuning to converge to the global optimum. We also point out that learning one hidden convolutional layer with respect to a Gaussian distribution and just one disjoint patch $P$ (the other patches may be arbitrary) is easy in the following sense: Convotron can efficiently recover the hidden weight vector by updating only in the direction of $P$.}
}

Endnote

%0 Conference Paper
%T Learning One Convolutional Layer with Overlapping Patches
%A Surbhi Goel
%A Adam Klivans
%A Raghu Meka
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-goel18a
%I PMLR
%P 1783--1791
%U https://proceedings.mlr.press/v80/goel18a.html
%V 80
%X We give the first provably efficient algorithm for learning a one-hidden-layer convolutional network with respect to a general class of (potentially overlapping) patches under mild conditions on the underlying distribution. We prove that our framework captures commonly used schemes from computer vision, including one-dimensional and two-dimensional “patch and stride” convolutions. Our algorithm, Convotron, is inspired by recent work applying isotonic regression to learning neural networks. Convotron uses a simple, iterative update rule that is stochastic in nature and tolerant to noise (it requires only that the conditional mean function is a one-layer convolutional network, as opposed to the realizable setting). In contrast to gradient descent, Convotron requires no special initialization or learning-rate tuning to converge to the global optimum. We also point out that learning one hidden convolutional layer with respect to a Gaussian distribution and just one disjoint patch $P$ (the other patches may be arbitrary) is easy in the following sense: Convotron can efficiently recover the hidden weight vector by updating only in the direction of $P$.

APA

Goel, S., Klivans, A. & Meka, R. (2018). Learning One Convolutional Layer with Overlapping Patches. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1783-1791. Available from https://proceedings.mlr.press/v80/goel18a.html.
