An uncertainty principle for Linear Recurrent Neural Networks

Alexandre François, Antonio Orvieto, Francis Bach
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:2143-2187, 2025.

Abstract

We consider linear recurrent neural networks, which have become a key building block of sequence modeling due to their ability to perform stable and effective long-range modeling. In this paper, we aim to characterize this ability on the simple but core copy task, whose goal is to build a linear filter of order $S$ that approximates the filter that looks $K$ time steps in the past (which we refer to as the shift-$K$ filter), where $K$ is larger than $S$. Using classical signal models and a quadratic cost, we fully characterize the problem by providing lower bounds on the approximation error, together with explicit filters that match these bounds up to constants. The optimal performance highlights an uncertainty principle for this task: the optimal filter has to average values around the $K$-th time step in the past over a range (width) proportional to $K/S$.
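
To make the setting concrete, below is a minimal NumPy sketch (not the paper's construction): it fits a diagonal linear RNN filter of order $S$ to the shift-$K$ target by least squares, since such a filter's impulse response is a sum of $S$ exponentials $h_t = \sum_s c_s \lambda_s^t$. The pole placement, horizon, and width measure are illustrative assumptions; the sketch only demonstrates that the fitted filter spreads its mass around lag $K$ with a width on the order of $K/S$.

import numpy as np

# Illustrative sketch: approximate the shift-K filter (impulse response e_K)
# with an order-S diagonal linear RNN filter h_t = sum_s c_s * lam_s^t.
K, S, T = 64, 8, 256            # shift, filter order, fitting horizon (assumed values)
t = np.arange(T)

# Assumed pole placement: stable real poles near 1, spread so the slowest
# mode decays on a timescale comparable to K.
lam = 1.0 - np.linspace(0.5, 4.0, S) / K
V = lam[None, :] ** t[:, None]  # T x S matrix of exponentials lam_s^t

target = np.zeros(T)
target[K] = 1.0                 # shift-K impulse response e_K

# Least-squares fit of the mode coefficients c against the target.
c, *_ = np.linalg.lstsq(V, target, rcond=None)
h = V @ c                       # fitted impulse response

# Crude width estimate: standard deviation of |h| viewed as a weight over lags.
# Consistent with the uncertainty principle, the mass centers near K and the
# width comes out on the order of K/S rather than concentrating at one lag.
w = np.abs(h) / np.abs(h).sum()
center = (w * t).sum()
width = np.sqrt((w * (t - center) ** 2).sum())
print(f"approx. error = {np.linalg.norm(h - target):.3f}, "
      f"center ~ {center:.1f}, width ~ {width:.1f} (K/S = {K / S:.1f})")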

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-francois25a,
  title     = {An uncertainty principle for Linear Recurrent Neural Networks},
  author    = {Fran\c{c}ois, Alexandre and Orvieto, Antonio and Bach, Francis},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages     = {2143--2187},
  year      = {2025},
  editor    = {Haghtalab, Nika and Moitra, Ankur},
  volume    = {291},
  series    = {Proceedings of Machine Learning Research},
  month     = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/francois25a/francois25a.pdf},
  url       = {https://proceedings.mlr.press/v291/francois25a.html},
  abstract  = {We consider linear recurrent neural networks, which have become a key building block of sequence modeling due to their ability for stable and effective long-range modeling. In this paper, we aim at characterizing this ability on the simple but core copy task, whose goal is to build a linear filter of order $S$ that approximates the filter that looks $K$ time steps in the past (which we refer to as the shift-$K$ filter), where $K$ is larger than $S$. Using classical signal models and quadratic cost, we fully characterize the problem by providing lower bounds of approximation, as well as explicit filters that achieve this lower bound up to constants. The optimal performance highlights an uncertainty principle for this task: the optimal filter has to average values around the $K$-th time step in the past with a range (width) that is proportional to $K/S$.}
}
Endnote
%0 Conference Paper
%T An uncertainty principle for Linear Recurrent Neural Networks
%A Alexandre François
%A Antonio Orvieto
%A Francis Bach
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-francois25a
%I PMLR
%P 2143--2187
%U https://proceedings.mlr.press/v291/francois25a.html
%V 291
%X We consider linear recurrent neural networks, which have become a key building block of sequence modeling due to their ability for stable and effective long-range modeling. In this paper, we aim at characterizing this ability on the simple but core copy task, whose goal is to build a linear filter of order $S$ that approximates the filter that looks $K$ time steps in the past (which we refer to as the shift-$K$ filter), where $K$ is larger than $S$. Using classical signal models and quadratic cost, we fully characterize the problem by providing lower bounds of approximation, as well as explicit filters that achieve this lower bound up to constants. The optimal performance highlights an uncertainty principle for this task: the optimal filter has to average values around the $K$-th time step in the past with a range (width) that is proportional to $K/S$.
APA
François, A., Orvieto, A., & Bach, F. (2025). An uncertainty principle for Linear Recurrent Neural Networks. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:2143-2187. Available from https://proceedings.mlr.press/v291/francois25a.html.
