An Upper Limit of Decaying Rate with Respect to Frequency in Linear Frequency Principle Model

Tao Luo, Zheng Ma, Zhiwei Wang, Zhiqin John Xu, Yaoyu Zhang
Proceedings of Mathematical and Scientific Machine Learning, PMLR 190:205-214, 2022.

Abstract

Deep neural networks (DNNs) usually learn the target function from low to high frequency, a phenomenon known as the frequency principle or spectral bias. This frequency principle sheds light on a high-frequency curse of DNNs: they find it difficult to learn high-frequency information. Inspired by the frequency principle, a series of works has been devoted to developing algorithms that overcome the high-frequency curse. A natural question arises: what is the upper limit of the decaying rate w.r.t. frequency when one trains a DNN? In this work, we abstract from sufficiently wide two-layer neural networks a paradigm for modeling and analyzing algorithms for supervised learning problems, namely the linear frequency principle (LFP) model. Our theory confirms that there is a critical decaying rate w.r.t. frequency in the LFP model. It is precisely because of this limit that a sufficiently wide DNN interpolates the training data with a function of a certain regularity. However, if the decaying rate of an algorithm in our paradigm exceeds this upper limit, then the algorithm interpolates the training data with a trivial function, i.e., one that is non-zero only at the training data points. This work rigorously proves that the high-frequency curse is an intrinsic difficulty of the LFP model, which provides similar insight into DNNs.
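For readers unfamiliar with the LFP model, the sketch below gives a schematic frequency-domain gradient flow of the kind the abstract refers to. It is illustrative only: the exact kernel gamma(xi), the weighting by the data distribution, and the power-law exponent alpha are assumptions for exposition, not the precise model analyzed in the paper.

% Schematic LFP-type dynamics (illustrative assumptions, see lead-in above).
\documentclass{article}
\usepackage{amsmath}
\begin{document}
% The Fourier component \hat{u}(\xi,t) of the learned function relaxes
% toward the target \hat{f}(\xi) at a frequency-dependent rate \gamma(\xi),
% which decays in |\xi| with some exponent \alpha (assumed here).
\begin{align*}
  \partial_t \hat{u}(\xi, t)
    = -\gamma(\xi)\,\bigl(\hat{u}(\xi, t) - \hat{f}(\xi)\bigr),
  \qquad
  \gamma(\xi) \sim |\xi|^{-\alpha} \ \text{as } |\xi| \to \infty .
\end{align*}
\end{document}

In this schematic picture, a larger alpha means high frequencies are learned more slowly; the paper's result can then be read as asking how fast such a decaying rate can be before the interpolant that the dynamics converges to loses all regularity and is non-zero only at the training data points.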

Cite this Paper


BibTeX
@InProceedings{pmlr-v190-luo22a,
  title     = {An Upper Limit of Decaying Rate with Respect to Frequency in Linear Frequency Principle Model},
  author    = {Luo, Tao and Ma, Zheng and Wang, Zhiwei and John Xu, Zhiqin and Zhang, Yaoyu},
  booktitle = {Proceedings of Mathematical and Scientific Machine Learning},
  pages     = {205--214},
  year      = {2022},
  editor    = {Dong, Bin and Li, Qianxiao and Wang, Lei and Xu, Zhi-Qin John},
  volume    = {190},
  series    = {Proceedings of Machine Learning Research},
  month     = {15--17 Aug},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v190/luo22a/luo22a.pdf},
  url       = {https://proceedings.mlr.press/v190/luo22a.html}
}
Endnote
%0 Conference Paper
%T An Upper Limit of Decaying Rate with Respect to Frequency in Linear Frequency Principle Model
%A Tao Luo
%A Zheng Ma
%A Zhiwei Wang
%A Zhiqin John Xu
%A Yaoyu Zhang
%B Proceedings of Mathematical and Scientific Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Bin Dong
%E Qianxiao Li
%E Lei Wang
%E Zhi-Qin John Xu
%F pmlr-v190-luo22a
%I PMLR
%P 205--214
%U https://proceedings.mlr.press/v190/luo22a.html
%V 190
APA
Luo, T., Ma, Z., Wang, Z., John Xu, Z. & Zhang, Y. (2022). An Upper Limit of Decaying Rate with Respect to Frequency in Linear Frequency Principle Model. Proceedings of Mathematical and Scientific Machine Learning, in Proceedings of Machine Learning Research 190:205-214. Available from https://proceedings.mlr.press/v190/luo22a.html.
