Do RNN and LSTM have Long Memory?

Jingyu Zhao, Feiqing Huang, Jia Lv, Yanjie Duan, Zhen Qin, Guodong Li, Guangjian Tian
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:11365-11375, 2020.

Abstract

The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, this paper raises the question: do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority in modeling the long-term dependence of various datasets is illustrated.
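The statistical distinction the abstract draws can be sketched numerically. In the long-memory literature, a process has long memory when its weights on past observations decay polynomially, so their absolute sum diverges; exponentially decaying weights (the behavior the paper proves for RNN and LSTM) have a finite sum and hence short memory. The snippet below is a minimal illustration of that dichotomy, not code from the paper; the kernel forms and parameter values are illustrative assumptions.

```python
# Illustrative only: compare short-memory (exponential) and long-memory
# (polynomial) weight kernels. Long memory means sum_k |w_k| diverges.

def exponential_weights(rho, n):
    """Short-memory kernel w_k = rho^k, 0 < rho < 1 (RNN/LSTM-style decay)."""
    return [rho ** k for k in range(1, n + 1)]

def polynomial_weights(d, n):
    """Long-memory kernel w_k = k^(d-1), 0 < d < 0.5 (fractional-integration style)."""
    return [k ** (d - 1) for k in range(1, n + 1)]

# The exponential partial sum is bounded by rho/(1 - rho) = 9 for rho = 0.9,
# while the polynomial partial sum keeps growing as n increases.
exp_sum = sum(exponential_weights(0.9, 10_000))    # converges near 9
poly_10k = sum(polynomial_weights(0.3, 10_000))
poly_1k = sum(polynomial_weights(0.3, 1_000))
print(exp_sum, poly_1k, poly_10k)
```

Running this shows the exponential sum saturating while the polynomial sum continues to climb with the horizon, which is the sense in which polynomially decaying weights retain "long memory" of the distant past.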

Cite this Paper


BibTeX
@InProceedings{pmlr-v119-zhao20c,
  title     = {Do {RNN} and {LSTM} have Long Memory?},
  author    = {Zhao, Jingyu and Huang, Feiqing and Lv, Jia and Duan, Yanjie and Qin, Zhen and Li, Guodong and Tian, Guangjian},
  booktitle = {Proceedings of the 37th International Conference on Machine Learning},
  pages     = {11365--11375},
  year      = {2020},
  editor    = {III, Hal Daumé and Singh, Aarti},
  volume    = {119},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--18 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v119/zhao20c/zhao20c.pdf},
  url       = {https://proceedings.mlr.press/v119/zhao20c.html},
  abstract  = {The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, this paper raises the question - do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.}
}
Endnote
%0 Conference Paper
%T Do RNN and LSTM have Long Memory?
%A Jingyu Zhao
%A Feiqing Huang
%A Jia Lv
%A Yanjie Duan
%A Zhen Qin
%A Guodong Li
%A Guangjian Tian
%B Proceedings of the 37th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2020
%E Hal Daumé III
%E Aarti Singh
%F pmlr-v119-zhao20c
%I PMLR
%P 11365--11375
%U https://proceedings.mlr.press/v119/zhao20c.html
%V 119
%X The LSTM network was proposed to overcome the difficulty in learning long-term dependence, and has made significant advancements in applications. With its success and drawbacks in mind, this paper raises the question - do RNN and LSTM have long memory? We answer it partially by proving that RNN and LSTM do not have long memory from a statistical perspective. A new definition for long memory networks is further introduced, and it requires the model weights to decay at a polynomial rate. To verify our theory, we convert RNN and LSTM into long memory networks by making a minimal modification, and their superiority is illustrated in modeling long-term dependence of various datasets.
APA
Zhao, J., Huang, F., Lv, J., Duan, Y., Qin, Z., Li, G., & Tian, G. (2020). Do RNN and LSTM have Long Memory? Proceedings of the 37th International Conference on Machine Learning, in Proceedings of Machine Learning Research 119:11365-11375. Available from https://proceedings.mlr.press/v119/zhao20c.html.