[edit]

# MLI Formula: A Nearly Scale-Invariant Solution with Noise Perturbation

*Proceedings of the 41st International Conference on Machine Learning*, PMLR 235:47912-47928, 2024.

#### Abstract

Monotonic Linear Interpolation (MLI) refers to the peculiar phenomenon that the error between the initial and converged model monotonically decreases along the linear interpolation, i.e., $(1-\alpha)\boldsymbol{\theta}_0 + \alpha \boldsymbol{\theta}_F$. Previous works focus on paired initial and converged points, relating MLI to the smoothness of the optimization trajectory. In this paper, we find a shocking fact that the error curves still exhibit a monotonic decrease when $\boldsymbol{\theta}_0$ is replaced with noise or even zero values, implying that the decreasing curve may be primarily related to the property of the converged model rather than the optimization trajectory. We further explore the relationship between $\alpha\boldsymbol{\theta}_F$ and $\boldsymbol{\theta}_F$ and propose scale invariance properties in various cases, including Generalized Scale Invariance (GSI), Rectified Scale Invariance (RSI), and Normalized Scale Invariance (NSI). From an inverse perspective, the MLI formula is essentially an equation that adds varying levels of noise (i.e., $(1-\alpha)\boldsymbol{\epsilon}$) to a nearly scale-invariant network (i.e., $\alpha \boldsymbol{\theta}_F$), resulting in a monotonically increasing error as the noise level rises. MLI is a special case where $\boldsymbol{\epsilon}$ is equal to $\boldsymbol{\theta}_0$.