L1-regularized Neural Networks are Improperly Learnable in Polynomial Time

Yuchen Zhang; Jason D. Lee; Michael I. Jordan

L1-regularized Neural Networks are Improperly Learnable in Polynomial Time

Yuchen Zhang, Jason D. Lee, Michael I. Jordan

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:993-1001, 2016.

Abstract

We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has k hidden layers and that the \ell_1-norm of the incoming weights of any neuron is bounded by L. We present a kernel-based method, such that with probability at least 1 - δ, it learns a predictor whose generalization error is at most εworse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in (1/ε,\log(1/δ),F(k,L)), where F(k,L) is a function depending on (k,L) and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like activation functions and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-zhangd16,
  title = 	 {L1-regularized Neural Networks are Improperly Learnable in Polynomial Time},
  author = 	 {Zhang, Yuchen and Lee, Jason D. and Jordan, Michael I.},
  booktitle = 	 {Proceedings of The 33rd International Conference on Machine Learning},
  pages = 	 {993--1001},
  year = 	 {2016},
  editor = 	 {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = 	 {48},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {New York, New York, USA},
  month = 	 {20--22 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v48/zhangd16.pdf},
  url = 	 {https://proceedings.mlr.press/v48/zhangd16.html},
  abstract = 	 {We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has k hidden layers and that the \ell_1-norm of the incoming weights of any neuron is bounded by L. We present a kernel-based method, such that with probability at least 1 - δ, it learns a predictor whose generalization error is at most εworse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in (1/ε,\log(1/δ),F(k,L)), where F(k,L) is a function depending on (k,L) and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like activation functions and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.}
}

Endnote

%0 Conference Paper
%T L1-regularized Neural Networks are Improperly Learnable in Polynomial Time
%A Yuchen Zhang
%A Jason D. Lee
%A Michael I. Jordan
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger	
%F pmlr-v48-zhangd16
%I PMLR
%P 993--1001
%U https://proceedings.mlr.press/v48/zhangd16.html
%V 48
%X We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has k hidden layers and that the \ell_1-norm of the incoming weights of any neuron is bounded by L. We present a kernel-based method, such that with probability at least 1 - δ, it learns a predictor whose generalization error is at most εworse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in (1/ε,\log(1/δ),F(k,L)), where F(k,L) is a function depending on (k,L) and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like activation functions and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.

RIS


TY  - CPAPER
TI  - L1-regularized Neural Networks are Improperly Learnable in Polynomial Time
AU  - Yuchen Zhang
AU  - Jason D. Lee
AU  - Michael I. Jordan
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger	
ID  - pmlr-v48-zhangd16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 993
EP  - 1001
L1  - http://proceedings.mlr.press/v48/zhangd16.pdf
UR  - https://proceedings.mlr.press/v48/zhangd16.html
AB  - We study the improper learning of multi-layer neural networks. Suppose that the neural network to be learned has k hidden layers and that the \ell_1-norm of the incoming weights of any neuron is bounded by L. We present a kernel-based method, such that with probability at least 1 - δ, it learns a predictor whose generalization error is at most εworse than that of the neural network. The sample complexity and the time complexity of the presented method are polynomial in the input dimension and in (1/ε,\log(1/δ),F(k,L)), where F(k,L) is a function depending on (k,L) and on the activation function, independent of the number of neurons. The algorithm applies to both sigmoid-like activation functions and ReLU-like activation functions. It implies that any sufficiently sparse neural network is learnable in polynomial time.
ER  -

APA


Zhang, Y., Lee, J.D. & Jordan, M.I.. (2016). L1-regularized Neural Networks are Improperly Learnable in Polynomial Time. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:993-1001 Available from https://proceedings.mlr.press/v48/zhangd16.html.

Related Material

Download PDF