Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators

Carles Domingo-Enrich

Proceedings of Thirty Fifth Conference on Learning Theory, PMLR 178:421-447, 2022.

Abstract

We construct pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while declining exponentially in $d$ if $F$ is any two-layer network with polynomial weights. This shows that deep GAN discriminators are able to distinguish distributions that shallow discriminators cannot. Analogously, we build pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/(d\log d))$ for two-layer ReLU networks with polynomial weights, while declining exponentially for bounded-norm functions in the associated RKHS. This confirms that feature learning is beneficial for discriminators. Our bounds are based on Fourier transforms.
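
For orientation, the quantity in the abstract is the one that is maximized in an integral probability metric over a class of discriminators. The display below is a rough sketch of how the depth-separation statement reads in that notation. The symbols $\mathcal{F}$ and $d_{\mathcal{F}}$ are notation assumed here for illustration and do not come from the abstract; the precise hypotheses and rates are those stated in the paper.

```latex
% Sketch only; requires amsmath and amssymb.
% \mathcal{F} and d_{\mathcal{F}} are illustrative notation, not taken from the paper.
\begin{align*}
  d_{\mathcal{F}}(\mu_d, \nu_d)
    &= \sup_{F \in \mathcal{F}}
       \left| \mathbb{E}_{x \sim \mu_d}[F(x)] - \mathbb{E}_{x \sim \nu_d}[F(x)] \right|
    && \text{(integral probability metric)} \\
  \left| \mathbb{E}_{x \sim \mu_d}[F(x)] - \mathbb{E}_{x \sim \nu_d}[F(x)] \right|
    &= \Omega(1/d^2)
    && \text{for some three-layer ReLU } F \text{ (polynomial width and weights)} \\
  \left| \mathbb{E}_{x \sim \mu_d}[F(x)] - \mathbb{E}_{x \sim \nu_d}[F(x)] \right|
    &\ \text{is exponentially small in } d
    && \text{for every two-layer } F \text{ with polynomial weights}
\end{align*}
```

The feature-learning statement has the same shape, with the rate $\Omega(1/(d\log d))$ for two-layer ReLU networks set against exponential decay for bounded-norm functions in the associated RKHS.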

Cite this Paper

BibTeX


@InProceedings{pmlr-v178-domingo-enrich22a,
  title = 	 {Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators},
  author =       {Domingo-Enrich, Carles},
  booktitle = 	 {Proceedings of Thirty Fifth Conference on Learning Theory},
  pages = 	 {421--447},
  year = 	 {2022},
  editor = 	 {Loh, Po-Ling and Raginsky, Maxim},
  volume = 	 {178},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {02--05 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v178/domingo-enrich22a/domingo-enrich22a.pdf},
  url = 	 {https://proceedings.mlr.press/v178/domingo-enrich22a.html},
  abstract = 	 {We construct pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while declining exponentially in $d$ if $F$ is any two-layer network with polynomial weights. This shows that deep GAN discriminators are able to distinguish distributions that shallow discriminators cannot. Analogously, we build pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/(d\log d))$ for two-layer ReLU networks with polynomial weights, while declining exponentially for bounded-norm functions in the associated RKHS. This confirms that feature learning is beneficial for discriminators. Our bounds are based on Fourier transforms.}
}

Endnote

%0 Conference Paper
%T Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators
%A Carles Domingo-Enrich
%B Proceedings of Thirty Fifth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2022
%E Po-Ling Loh
%E Maxim Raginsky
%F pmlr-v178-domingo-enrich22a
%I PMLR
%P 421--447
%U https://proceedings.mlr.press/v178/domingo-enrich22a.html
%V 178
%X We construct pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that the quantity $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/d^2)$ for some three-layer ReLU network $F$ with polynomial width and weights, while declining exponentially in $d$ if $F$ is any two-layer network with polynomial weights. This shows that deep GAN discriminators are able to distinguish distributions that shallow discriminators cannot. Analogously, we build pairs of distributions $\mu_d, \nu_d$ on $\mathbb{R}^d$ such that $|\mathbb{E}_{x \sim \mu_d} [F(x)] - \mathbb{E}_{x \sim \nu_d} [F(x)]|$ decreases as $\Omega(1/(d\log d))$ for two-layer ReLU networks with polynomial weights, while declining exponentially for bounded-norm functions in the associated RKHS. This confirms that feature learning is beneficial for discriminators. Our bounds are based on Fourier transforms.

APA


Domingo-Enrich, C. (2022). Depth and Feature Learning are Provably Beneficial for Neural Network Discriminators. Proceedings of Thirty Fifth Conference on Learning Theory, in Proceedings of Machine Learning Research 178:421-447. Available from https://proceedings.mlr.press/v178/domingo-enrich22a.html.
