Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint

Eric C. Cyr; Mamikon A. Gulian; Ravi G. Patel; Mauro Perego; Nathaniel A. Trask

Proceedings of The First Mathematical and Scientific Machine Learning Conference, PMLR 107:512-536, 2020.

Abstract

Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations.

Cite this Paper

BibTeX

@InProceedings{pmlr-v107-cyr20a,
  title     = {Robust Training and Initialization of Deep Neural Networks: {A}n Adaptive Basis Viewpoint},
  author    = {Cyr, Eric C. and Gulian, Mamikon A. and Patel, Ravi G. and Perego, Mauro and Trask, Nathaniel A.},
  booktitle = {Proceedings of The First Mathematical and Scientific Machine Learning Conference},
  pages     = {512--536},
  year      = {2020},
  editor    = {Lu, Jianfeng and Ward, Rachel},
  volume    = {107},
  series    = {Proceedings of Machine Learning Research},
  month     = {20--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v107/cyr20a/cyr20a.pdf},
  url       = {https://proceedings.mlr.press/v107/cyr20a.html},
  abstract  = {Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations.}
}

Endnote

%0 Conference Paper
%T Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint
%A Eric C. Cyr
%A Mamikon A. Gulian
%A Ravi G. Patel
%A Mauro Perego
%A Nathaniel A. Trask
%B Proceedings of The First Mathematical and Scientific Machine Learning Conference
%C Proceedings of Machine Learning Research
%D 2020
%E Jianfeng Lu
%E Rachel Ward
%F pmlr-v107-cyr20a
%I PMLR
%P 512--536
%U https://proceedings.mlr.press/v107/cyr20a.html
%V 107
%X Motivated by the gap between theoretical optimal approximation rates of deep neural networks (DNNs) and the accuracy realized in practice, we seek to improve the training of DNNs. The adoption of an adaptive basis viewpoint of DNNs leads to novel initializations and a hybrid least squares/gradient descent optimizer. We provide analysis of these techniques and illustrate via numerical examples dramatic increases in accuracy and convergence rate for benchmarks characterizing scientific applications where DNNs are currently used, including regression problems and physics-informed neural networks for the solution of partial differential equations.

APA

Cyr, E.C., Gulian, M.A., Patel, R.G., Perego, M. & Trask, N.A. (2020). Robust Training and Initialization of Deep Neural Networks: An Adaptive Basis Viewpoint. Proceedings of The First Mathematical and Scientific Machine Learning Conference, in Proceedings of Machine Learning Research 107:512-536. Available from https://proceedings.mlr.press/v107/cyr20a.html.
