StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization

Shida Wang, Qianxiao Li
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:50766-50793, 2024.

Abstract

In this paper, we investigate the long-term memory learning capabilities of state-space models (SSMs) from the perspective of parameterization. We prove that state-space models without any reparameterization exhibit a memory limitation similar to that of traditional RNNs: the target relationships that can be stably approximated by state-space models must have exponentially decaying memory. Our analysis identifies this “curse of memory” as a consequence of the recurrent weights converging to a stability boundary, suggesting that a reparameterization technique can be effective. To this end, we introduce a class of reparameterization techniques for SSMs that effectively lifts this memory limitation. Besides improving approximation capabilities, we further show that a principled choice of reparameterization scheme can also enhance optimization stability. We validate our findings on synthetic datasets, language modeling, and image classification.
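A minimal sketch of the idea behind such a reparameterization is given below (hypothetical PyTorch code, not the authors' implementation): instead of learning the recurrent weights of a diagonal SSM directly, one learns an unconstrained parameter w and passes it through a map whose image lies strictly inside the stable region, so no gradient update can push the eigenvalues onto or across the stability boundary. The class name, the choice of a sigmoid map, and all dimensions are illustrative assumptions; the paper analyzes a general class of such reparameterizations.

```python
import torch
import torch.nn as nn


class StableDiagonalSSM(nn.Module):
    """Diagonal linear SSM layer with a stable reparameterization of the
    recurrent weights (illustrative sketch, not the paper's exact scheme)."""

    def __init__(self, d_state: int, d_input: int):
        super().__init__()
        self.w = nn.Parameter(torch.randn(d_state))  # unconstrained parameter
        self.B = nn.Parameter(torch.randn(d_state, d_input) / d_input**0.5)
        self.C = nn.Parameter(torch.randn(1, d_state) / d_state**0.5)

    def recurrent_weights(self) -> torch.Tensor:
        # One possible stable map: sigmoid sends any finite w into (0, 1),
        # so the discrete-time eigenvalues stay strictly inside the unit
        # circle regardless of how w moves during training.
        return torch.sigmoid(self.w)

    def forward(self, u: torch.Tensor) -> torch.Tensor:
        # u: (batch, time, d_input) -> y: (batch, time, 1)
        a = self.recurrent_weights()
        h = torch.zeros(u.shape[0], a.shape[0], device=u.device, dtype=u.dtype)
        ys = []
        for t in range(u.shape[1]):
            h = a * h + u[:, t] @ self.B.T   # x_{t+1} = f(w) * x_t + B u_t
            ys.append(h @ self.C.T)          # y_t = C x_t
        return torch.stack(ys, dim=1)
```

Because the sigmoid never reaches 1 for any finite w, the learned eigenvalues can approach, but never cross, the stability boundary, which is the property a stable reparameterization is meant to guarantee.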

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-wang24ag,
  title     = {{S}table{SSM}: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization},
  author    = {Wang, Shida and Li, Qianxiao},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {50766--50793},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/wang24ag/wang24ag.pdf},
  url       = {https://proceedings.mlr.press/v235/wang24ag.html},
  abstract  = {In this paper, we investigate the long-term memory learning capabilities of state-space models (SSMs) from the perspective of parameterization. We prove that state-space models without any reparameterization exhibit a memory limitation similar to that of traditional RNNs: the target relationships that can be stably approximated by state-space models must have an exponential decaying memory. Our analysis identifies this “curse of memory” as a result of the recurrent weights converging to a stability boundary, suggesting that a reparameterization technique can be effective. To this end, we introduce a class of reparameterization techniques for SSMs that effectively lift its memory limitations. Besides improving approximation capabilities, we further illustrate that a principled choice of reparameterization scheme can also enhance optimization stability. We validate our findings using synthetic datasets, language models and image classifications.}
}
Endnote
%0 Conference Paper
%T StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization
%A Shida Wang
%A Qianxiao Li
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-wang24ag
%I PMLR
%P 50766--50793
%U https://proceedings.mlr.press/v235/wang24ag.html
%V 235
%X In this paper, we investigate the long-term memory learning capabilities of state-space models (SSMs) from the perspective of parameterization. We prove that state-space models without any reparameterization exhibit a memory limitation similar to that of traditional RNNs: the target relationships that can be stably approximated by state-space models must have an exponential decaying memory. Our analysis identifies this “curse of memory” as a result of the recurrent weights converging to a stability boundary, suggesting that a reparameterization technique can be effective. To this end, we introduce a class of reparameterization techniques for SSMs that effectively lift its memory limitations. Besides improving approximation capabilities, we further illustrate that a principled choice of reparameterization scheme can also enhance optimization stability. We validate our findings using synthetic datasets, language models and image classifications.
APA
Wang, S. & Li, Q. (2024). StableSSM: Alleviating the Curse of Memory in State-space Models through Stable Reparameterization. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:50766-50793. Available from https://proceedings.mlr.press/v235/wang24ag.html.
