An Upper Limit of Decaying Rate with Respect to Frequency in Linear Frequency Principle Model
Proceedings of Mathematical and Scientific Machine Learning, PMLR 190:205-214, 2022.
Abstract
A deep neural network (DNN) usually learns the target function from low to high frequency, a phenomenon known as the frequency principle or spectral bias. The frequency principle sheds light on a high-frequency curse of DNNs: they have difficulty learning high-frequency information. Inspired by the frequency principle, a series of works has been devoted to developing algorithms that overcome the high-frequency curse. A natural question arises: what is the upper limit of the decaying rate w.r.t. frequency when one trains a DNN? In this work, we abstract from the sufficiently wide two-layer neural network a paradigm for modeling and analyzing algorithms for supervised learning problems, namely the linear frequency principle (LFP) model. Our theory confirms that there is a critical decaying rate w.r.t. frequency in the LFP model. It is precisely because of this limit that a sufficiently wide DNN interpolates the training data with a function of a certain regularity. However, if the decay rate of an algorithm in our paradigm exceeds this upper limit, the algorithm interpolates the training data with a trivial function, i.e., a function that is non-zero only at the training data points. This work rigorously proves that the high-frequency curse is an intrinsic difficulty of the LFP model, which provides similar insight into DNNs.
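To make the object under discussion concrete, the display below is a schematic sketch of LFP-type training dynamics in frequency space; the kernel \(\gamma^2(\xi)\) and the power-law exponent \(\alpha\) are illustrative placeholders, not the paper's exact statement:

\[
\partial_t \hat{u}(\xi, t) = -\gamma^2(\xi)\,\widehat{\big((u - f)\rho\big)}(\xi),
\qquad
\gamma^2(\xi) \sim |\xi|^{-\alpha} \ \text{as } |\xi| \to \infty.
\]

Here \(u(\cdot, t)\) is the model output, \(f\) the target, \(\rho\) the data distribution, and \(\hat{\cdot}\) the Fourier transform. A larger \(\alpha\) means the learning speed decays faster at high frequencies; the upper limit in the title concerns how large this decay rate can be before the long-time solution degenerates into a trivial interpolant.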
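The frequency principle itself is easy to observe numerically. The sketch below is our illustration, not code from the paper; the network width, target frequencies, and learning rate are arbitrary choices. It trains a two-layer tanh network by full-batch gradient descent on a two-frequency target and prints the residual amplitude at each frequency:

```python
import numpy as np

# Illustrative sketch of the frequency principle (not the paper's code):
# a two-layer tanh network fits y = sin(x) + sin(5x); we track how fast
# the residual at each frequency decays during full-batch gradient descent.
rng = np.random.default_rng(0)
n, m, lr = 256, 400, 0.05                 # samples, hidden width, step size (arbitrary)
x = np.linspace(-np.pi, np.pi, n, endpoint=False)
freqs = [1, 5]                            # one low and one high target frequency
y = sum(np.sin(k * x) for k in freqs)

W = rng.normal(0.0, 1.0, m)               # input weights
b = rng.normal(0.0, 1.0, m)               # biases
a = rng.normal(0.0, 1.0 / np.sqrt(m), m)  # output weights

for step in range(20001):
    h = np.tanh(np.outer(x, W) + b)       # (n, m) hidden activations
    err = h @ a - y                       # residual u(x) - f(x)
    if step % 4000 == 0:
        # amplitude of the residual at each target frequency
        amps = [2 * abs(np.mean(err * np.exp(-1j * k * x))) for k in freqs]
        print(step, ["%.3f" % A for A in amps])
    # gradients of the loss 0.5 * mean(err**2)
    g = err[:, None] * a * (1.0 - h**2)   # (n, m), backprop through tanh
    a -= lr * (h.T @ err) / n
    W -= lr * (g * x[:, None]).sum(0) / n
    b -= lr * g.sum(0) / n
```

On a typical run, the k = 1 amplitude drops well before the k = 5 amplitude does, which is the low-to-high-frequency learning order the abstract describes.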