Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks

Yihang Gao, Yiqi Gu, Michael Ng
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:10676-10707, 2023.

Abstract

The main aim of this paper is to carry out the convergence analysis of gradient descent for two-layer physics-informed neural networks (PINNs). Here, the loss function involves derivatives of the network outputs with respect to the network inputs, so the interaction among the trainable parameters is more complicated than in simple regression and classification tasks. We first establish the positive definiteness of the associated Gram matrices and prove that gradient flow finds the global optima of the empirical loss under over-parameterization. Then, we show that standard gradient descent converges to the global optima of the loss with proper choices of the learning rate. The framework of our analysis applies to various categories of PDEs (e.g., linear second-order PDEs) and common types of network initialization (LecunUniform, etc.). Our theoretical results do not require strict hypotheses on the training samples and impose a looser requirement on the network width than some previous works.
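
To make the setting concrete, the following is a minimal sketch (not the paper's construction) of a two-layer PINN whose loss contains second derivatives of the network output with respect to its input, trained by plain gradient descent. The 1-D Poisson problem, tanh activation, Gaussian initialization, width, collocation points, and learning rate are all illustrative assumptions chosen for this sketch.

# Minimal sketch (assumed setting, not the paper's exact one): a two-layer PINN
# for -u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0, trained by plain gradient descent.
import jax
import jax.numpy as jnp

m = 512                                   # network width (over-parameterization regime)
key_w, key_a = jax.random.split(jax.random.PRNGKey(0))
params = {
    "w": jax.random.normal(key_w, (m,)),  # hidden-layer weights
    "a": jax.random.normal(key_a, (m,)),  # output-layer weights
}

def u(params, x):
    # Two-layer network with 1/sqrt(m) scaling, scalar input and output.
    return jnp.dot(params["a"], jnp.tanh(params["w"] * x)) / jnp.sqrt(m)

def f(x):
    # Manufactured source term so that u*(x) = sin(pi x) solves the PDE.
    return (jnp.pi ** 2) * jnp.sin(jnp.pi * x)

# Second derivative of the network output with respect to its input: this is
# what makes the PINN loss couple the trainable parameters nontrivially.
u_xx = jax.grad(jax.grad(u, argnums=1), argnums=1)

x_int = jnp.linspace(0.05, 0.95, 64)      # interior collocation points
x_bdy = jnp.array([0.0, 1.0])             # boundary points

def loss(params):
    residual = jax.vmap(lambda x: -u_xx(params, x) - f(x))(x_int)
    boundary = jax.vmap(lambda x: u(params, x))(x_bdy)
    return 0.5 * jnp.mean(residual ** 2) + 0.5 * jnp.mean(boundary ** 2)

lr = 1e-3                                 # learning rate (illustrative)
grad_loss = jax.jit(jax.grad(loss))
for step in range(2000):
    g = grad_loss(params)
    params = jax.tree_util.tree_map(lambda p, gp: p - lr * gp, params, g)

print(float(loss(params)))                # empirical PINN loss after training

The 1/sqrt(m) output scaling mirrors the scaling commonly used in over-parameterized (NTK-style) analyses; the sketch only illustrates the ingredients of the loss, not the paper's convergence argument.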

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-gao23b,
  title     = {Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks},
  author    = {Gao, Yihang and Gu, Yiqi and Ng, Michael},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {10676--10707},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/gao23b/gao23b.pdf},
  url       = {https://proceedings.mlr.press/v202/gao23b.html},
  abstract  = {The main aim of this paper is to conduct the convergence analysis of the gradient descent for two-layer physics-informed neural networks (PINNs). Here, the loss function involves derivatives of neural network outputs with respect to its inputs, so the interaction between the trainable parameters is more complicated compared with simple regression and classification tasks. We first develop the positive definiteness of Gram matrices and prove that the gradient flow finds the global optima of the empirical loss under over-parameterization. Then, we demonstrate that the standard gradient descent converges to the global optima of the loss with proper choices of learning rates. The framework of our analysis works for various categories of PDEs (e.g., linear second-order PDEs) and common types of network initialization (LecunUniform etc.). Our theoretical results do not need a very strict hypothesis for training samples and have a looser requirement on the network width compared with some previous works.}
}
Endnote
%0 Conference Paper
%T Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks
%A Yihang Gao
%A Yiqi Gu
%A Michael Ng
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett
%F pmlr-v202-gao23b
%I PMLR
%P 10676--10707
%U https://proceedings.mlr.press/v202/gao23b.html
%V 202
%X The main aim of this paper is to conduct the convergence analysis of the gradient descent for two-layer physics-informed neural networks (PINNs). Here, the loss function involves derivatives of neural network outputs with respect to its inputs, so the interaction between the trainable parameters is more complicated compared with simple regression and classification tasks. We first develop the positive definiteness of Gram matrices and prove that the gradient flow finds the global optima of the empirical loss under over-parameterization. Then, we demonstrate that the standard gradient descent converges to the global optima of the loss with proper choices of learning rates. The framework of our analysis works for various categories of PDEs (e.g., linear second-order PDEs) and common types of network initialization (LecunUniform etc.). Our theoretical results do not need a very strict hypothesis for training samples and have a looser requirement on the network width compared with some previous works.
APA
Gao, Y., Gu, Y. & Ng, M. (2023). Gradient Descent Finds the Global Optima of Two-Layer Physics-Informed Neural Networks. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:10676-10707. Available from https://proceedings.mlr.press/v202/gao23b.html.
