Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning

Frederik Benzing, Marcelo Matheus Gauy, Asier Mujika, Anders Martinsson, Angelika Steger
Proceedings of the 36th International Conference on Machine Learning, PMLR 97:604-613, 2019.

Abstract

One of the central goals of Recurrent Neural Networks (RNNs) is to learn long-term dependencies in sequential data. Nevertheless, the most popular training method, Truncated Backpropagation through Time (TBPTT), categorically forbids learning dependencies beyond the truncation horizon. In contrast, the online training algorithm Real Time Recurrent Learning (RTRL) provides untruncated gradients, with the disadvantage of impractically large computational costs. Recently published approaches reduce these costs by providing noisy approximations of RTRL. We present a new approximation algorithm of RTRL, Optimal Kronecker-Sum Approximation (OK). We prove that OK is optimal for a class of approximations of RTRL, which includes all approaches published so far. Additionally, we show that OK has empirically negligible noise: Unlike previous algorithms it matches TBPTT in a real world task (character-level Penn TreeBank) and can exploit online parameter updates to outperform TBPTT in a synthetic string memorization task. Code available at GitHub.
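To make the cost argument in the abstract concrete, the following is a minimal sketch of exact RTRL for a small tanh RNN. It is not the OK algorithm and is not taken from the paper's code. The network form `h_t = tanh(W h_{t-1} + U x_t)`, the restriction to the recurrent weights `W`, and the names `rtrl_influence` and `final_state` are all assumptions made for illustration. The sketch carries the influence matrix `G_t = d h_t / d vec(W)` forward in time, which is what gives untruncated gradients online.

```python
import numpy as np

def rtrl_influence(W, U, xs, h0):
    """Run a tanh RNN and track G_t = d h_t / d vec(W) online (exact RTRL).

    Illustrative assumption: only the recurrent weights W are differentiated,
    and vec(W) uses row-major order.
    """
    n = W.shape[0]
    h = h0.copy()
    G = np.zeros((n, n * n))  # influence matrix, n x n^2 entries
    for x in xs:
        # Immediate Jacobian of the pre-activation w.r.t. vec(W):
        # entry [i, j*n + k] = delta_ij * h_prev[k], i.e. I kron h_prev^T.
        F = np.kron(np.eye(n), h[None, :])
        h = np.tanh(W @ h + U @ x)
        D = 1.0 - h ** 2  # derivative of tanh, expressed via the new state
        # Forward recurrence: G_t = diag(D) (W G_{t-1} + F_t).
        G = D[:, None] * (W @ G + F)
    return h, G

def final_state(W, U, xs, h0):
    """Plain forward pass, used only to check the gradient numerically."""
    h = h0.copy()
    for x in xs:
        h = np.tanh(W @ h + U @ x)
    return h

rng = np.random.default_rng(0)
n, m, T = 3, 2, 5  # hidden size, input size, sequence length (toy values)
W = 0.5 * rng.standard_normal((n, n))
U = 0.5 * rng.standard_normal((n, m))
xs = rng.standard_normal((T, m))
h0 = np.zeros(n)
h_T, G_T = rtrl_influence(W, U, xs, h0)
```

In this sketch the product `W @ G` multiplies an n x n matrix by an n x n^2 matrix, so each time step costs on the order of n^4 operations and stores n^3 numbers, which is the impractical cost the abstract refers to. The immediate Jacobian `F` is a single Kronecker product, which gives some intuition for why approximations that keep `G` as a short sum of Kronecker products are a natural way to cut that cost.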

Cite this Paper


BibTeX
@InProceedings{pmlr-v97-benzing19a,
  title = {Optimal {K}ronecker-Sum Approximation of Real Time Recurrent Learning},
  author = {Benzing, Frederik and Gauy, Marcelo Matheus and Mujika, Asier and Martinsson, Anders and Steger, Angelika},
  booktitle = {Proceedings of the 36th International Conference on Machine Learning},
  pages = {604--613},
  year = {2019},
  editor = {Chaudhuri, Kamalika and Salakhutdinov, Ruslan},
  volume = {97},
  series = {Proceedings of Machine Learning Research},
  month = {09--15 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v97/benzing19a/benzing19a.pdf},
  url = {https://proceedings.mlr.press/v97/benzing19a.html},
  abstract = {One of the central goals of Recurrent Neural Networks (RNNs) is to learn long-term dependencies in sequential data. Nevertheless, the most popular training method, Truncated Backpropagation through Time (TBPTT), categorically forbids learning dependencies beyond the truncation horizon. In contrast, the online training algorithm Real Time Recurrent Learning (RTRL) provides untruncated gradients, with the disadvantage of impractically large computational costs. Recently published approaches reduce these costs by providing noisy approximations of RTRL. We present a new approximation algorithm of RTRL, Optimal Kronecker-Sum Approximation (OK). We prove that OK is optimal for a class of approximations of RTRL, which includes all approaches published so far. Additionally, we show that OK has empirically negligible noise: Unlike previous algorithms it matches TBPTT in a real world task (character-level Penn TreeBank) and can exploit online parameter updates to outperform TBPTT in a synthetic string memorization task. Code available at GitHub.}
}
Endnote
%0 Conference Paper
%T Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning
%A Frederik Benzing
%A Marcelo Matheus Gauy
%A Asier Mujika
%A Anders Martinsson
%A Angelika Steger
%B Proceedings of the 36th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2019
%E Kamalika Chaudhuri
%E Ruslan Salakhutdinov
%F pmlr-v97-benzing19a
%I PMLR
%P 604--613
%U https://proceedings.mlr.press/v97/benzing19a.html
%V 97
%X One of the central goals of Recurrent Neural Networks (RNNs) is to learn long-term dependencies in sequential data. Nevertheless, the most popular training method, Truncated Backpropagation through Time (TBPTT), categorically forbids learning dependencies beyond the truncation horizon. In contrast, the online training algorithm Real Time Recurrent Learning (RTRL) provides untruncated gradients, with the disadvantage of impractically large computational costs. Recently published approaches reduce these costs by providing noisy approximations of RTRL. We present a new approximation algorithm of RTRL, Optimal Kronecker-Sum Approximation (OK). We prove that OK is optimal for a class of approximations of RTRL, which includes all approaches published so far. Additionally, we show that OK has empirically negligible noise: Unlike previous algorithms it matches TBPTT in a real world task (character-level Penn TreeBank) and can exploit online parameter updates to outperform TBPTT in a synthetic string memorization task. Code available at GitHub.
APA
Benzing, F., Gauy, M. M., Mujika, A., Martinsson, A., & Steger, A. (2019). Optimal Kronecker-Sum Approximation of Real Time Recurrent Learning. Proceedings of the 36th International Conference on Machine Learning, in Proceedings of Machine Learning Research 97:604-613. Available from https://proceedings.mlr.press/v97/benzing19a.html.
