On the Memory Mechanism of Tensor-Power Recurrent Models

Hejia Qiu, Chao Li, Ying Weng, Zhun Sun, Xingyu He, Qibin Zhao
Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:3682-3690, 2021.

Abstract

Tensor-power (TP) recurrent models are a family of non-linear dynamical systems whose recurrence relation consists of a p-fold (a.k.a. degree-p) tensor product. Although such models appear frequently in advanced recurrent neural networks (RNNs), their memory property, a critical characteristic in sequence tasks, has to date received little study. In this work, we conduct a thorough investigation of the memory mechanism of TP recurrent models. Theoretically, we prove that a large degree p is an essential condition for achieving the long memory effect, yet it leads to unstable dynamical behavior. Empirically, we tackle this issue by extending the degree p from the discrete to a differentiable domain, so that it can be learned efficiently from a variety of datasets. Taken together, the new model is expected to benefit from the long memory effect in a stable manner. We show experimentally that the proposed model achieves performance competitive with various advanced RNNs in both single-cell and seq2seq architectures.
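To make the recurrence concrete, the minimal PyTorch sketch below implements a TP cell whose hidden update contracts a weight tensor against a p-fold tensor power of the hidden state, shown for p = 1 and p = 2. The learnable mixing weight alpha, which interpolates between the two degrees, is a hypothetical stand-in for illustration only; it is not the paper's actual continuous relaxation of p.

import torch
import torch.nn as nn

class TPCell(nn.Module):
    """Sketch of a degree-p tensor-power recurrent cell.

    Update rule: h_{t+1} = tanh(W . (h_t (x) ... (x) h_t) + U x_t + b),
    where W contracts a p-fold tensor power of h_t. Degrees 1 and 2 are
    shown; `alpha` is a hypothetical differentiable mix between them,
    NOT the authors' construction.
    """

    def __init__(self, input_size, hidden_size):
        super().__init__()
        d = hidden_size
        self.W1 = nn.Parameter(torch.randn(d, d) * 0.1)     # degree-1 weight matrix
        self.W2 = nn.Parameter(torch.randn(d, d, d) * 0.1)  # degree-2 weight tensor
        self.U = nn.Parameter(torch.randn(d, input_size) * 0.1)
        self.b = nn.Parameter(torch.zeros(d))
        # Unconstrained scalar; sigmoid(alpha) smoothly weights degree 1 vs. 2.
        self.alpha = nn.Parameter(torch.zeros(()))

    def forward(self, x, h):
        # Degree-1 term: W1 h  (p = 1, ordinary linear recurrence)
        deg1 = torch.einsum('ij,bj->bi', self.W1, h)
        # Degree-2 term: W2 contracted with h (x) h  (p = 2 tensor power)
        deg2 = torch.einsum('ijk,bj,bk->bi', self.W2, h, h)
        a = torch.sigmoid(self.alpha)
        return torch.tanh((1 - a) * deg1 + a * deg2
                          + torch.einsum('ij,bj->bi', self.U, x) + self.b)

Running the cell is the usual single-step loop: initialize h = torch.zeros(batch, hidden_size) and apply h = cell(x_t, h) at each time step. The degree-2 einsum makes the instability concern tangible: the update is quadratic in h, so hidden states with norm above 1 can grow rapidly under repeated application, which is the behavior the paper associates with large p.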

Cite this Paper

BibTeX
@InProceedings{pmlr-v130-qiu21a,
  title     = {On the Memory Mechanism of Tensor-Power Recurrent Models},
  author    = {Qiu, Hejia and Li, Chao and Weng, Ying and Sun, Zhun and He, Xingyu and Zhao, Qibin},
  booktitle = {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages     = {3682--3690},
  year      = {2021},
  editor    = {Banerjee, Arindam and Fukumizu, Kenji},
  volume    = {130},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--15 Apr},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v130/qiu21a/qiu21a.pdf},
  url       = {https://proceedings.mlr.press/v130/qiu21a.html},
  abstract  = {Tensor-power (TP) recurrent models are a family of non-linear dynamical systems whose recurrence relation consists of a p-fold (a.k.a. degree-p) tensor product. Although such models appear frequently in advanced recurrent neural networks (RNNs), their memory property, a critical characteristic in sequence tasks, has to date received little study. In this work, we conduct a thorough investigation of the memory mechanism of TP recurrent models. Theoretically, we prove that a large degree p is an essential condition for achieving the long memory effect, yet it leads to unstable dynamical behavior. Empirically, we tackle this issue by extending the degree p from the discrete to a differentiable domain, so that it can be learned efficiently from a variety of datasets. Taken together, the new model is expected to benefit from the long memory effect in a stable manner. We show experimentally that the proposed model achieves performance competitive with various advanced RNNs in both single-cell and seq2seq architectures.}
}
Endnote
%0 Conference Paper
%T On the Memory Mechanism of Tensor-Power Recurrent Models
%A Hejia Qiu
%A Chao Li
%A Ying Weng
%A Zhun Sun
%A Xingyu He
%A Qibin Zhao
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu
%F pmlr-v130-qiu21a
%I PMLR
%P 3682--3690
%U https://proceedings.mlr.press/v130/qiu21a.html
%V 130
%X Tensor-power (TP) recurrent models are a family of non-linear dynamical systems whose recurrence relation consists of a p-fold (a.k.a. degree-p) tensor product. Although such models appear frequently in advanced recurrent neural networks (RNNs), their memory property, a critical characteristic in sequence tasks, has to date received little study. In this work, we conduct a thorough investigation of the memory mechanism of TP recurrent models. Theoretically, we prove that a large degree p is an essential condition for achieving the long memory effect, yet it leads to unstable dynamical behavior. Empirically, we tackle this issue by extending the degree p from the discrete to a differentiable domain, so that it can be learned efficiently from a variety of datasets. Taken together, the new model is expected to benefit from the long memory effect in a stable manner. We show experimentally that the proposed model achieves performance competitive with various advanced RNNs in both single-cell and seq2seq architectures.
APA
Qiu, H., Li, C., Weng, Y., Sun, Z., He, X., & Zhao, Q. (2021). On the Memory Mechanism of Tensor-Power Recurrent Models. Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:3682-3690. Available from https://proceedings.mlr.press/v130/qiu21a.html.
