Reviving and Improving Recurrent Back-Propagation

Renjie Liao, Yuwen Xiong, Ethan Fetaya, Lisa Zhang, KiJung Yoon, Xaq Pitkow, Raquel Urtasun, Richard Zemel
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:3082-3091, 2018.

Abstract

In this paper, we revisit the recurrent back-propagation (RBP) algorithm, discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). We further investigate the relationship between Neumann-RBP and back propagation through time (BPTT) and its truncated version (TBPTT). Our Neumann-RBP has the same time complexity as TBPTT but only requires constant memory, whereas TBPTT’s memory cost scales linearly with the number of truncation steps. We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks and hyperparameter optimization for fully connected networks. All experiments demonstrate that RBPs, especially the Neumann-RBP variant, are efficient and effective for optimizing convergent recurrent neural networks.
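To make the core idea concrete, below is a minimal sketch of the Neumann-RBP gradient computation on a toy convergent recurrent network. The dynamics, variable names, dimensions, and the quadratic loss are illustrative assumptions, not the paper's reference implementation; the point it shows is that the steady-state gradient from the implicit function theorem, g^T (I - J)^{-1} dF/dw, can be approximated by a truncated Neumann series that needs only repeated vector-Jacobian products and constant memory in the number of truncation steps.

```python
import numpy as np

# Toy convergent recurrent dynamics: h_{t+1} = tanh(W h_t + U x).
# All shapes and names here are illustrative, not from the paper's code.
rng = np.random.default_rng(0)
n, m = 5, 3
W = 0.1 * rng.standard_normal((n, n))   # small weights -> contraction -> convergence
U = rng.standard_normal((n, m))
x = rng.standard_normal(m)
target = rng.standard_normal(n)

def step(h):
    return np.tanh(W @ h + U @ x)

# 1) Run the dynamics to an (approximate) steady state h*.
h = np.zeros(n)
for _ in range(200):
    h = step(h)

# 2) Jacobian J = dF/dh at h* (closed form for this toy update).
pre = W @ h + U @ x
J = np.diag(1.0 - np.tanh(pre) ** 2) @ W

# 3) Toy loss L = 0.5 * ||h* - target||^2, so dL/dh* = h* - target.
g = h - target

# 4) Neumann-RBP: approximate z = (I - J^T)^{-1} g with the truncated
#    Neumann series sum_{k=0}^{K} (J^T)^k g, using only repeated
#    vector-Jacobian products (constant memory in K).
K = 30
v = g.copy()
z = g.copy()
for _ in range(K):
    v = J.T @ v
    z += v

# 5) Gradient w.r.t. W: dL/dW_{ij} = z_i * (1 - tanh(pre_i)^2) * h*_j.
dW = np.outer((1.0 - np.tanh(pre) ** 2) * z, h)

# Sanity check against the exact implicit-function-theorem gradient.
z_exact = np.linalg.solve(np.eye(n) - J.T, g)
dW_exact = np.outer((1.0 - np.tanh(pre) ** 2) * z_exact, h)
print(np.max(np.abs(dW - dW_exact)))  # small, since the dynamics are convergent
```

In a deep-learning framework the explicit Jacobian would not be formed; the update v = J^T v is a vector-Jacobian product (e.g. torch.autograd.grad with grad_outputs=v), which is what gives Neumann-RBP its constant memory footprint relative to TBPTT, whose memory grows linearly with the number of truncation steps.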

Cite this Paper


BibTeX
@InProceedings{pmlr-v80-liao18c,
  title     = {Reviving and Improving Recurrent Back-Propagation},
  author    = {Liao, Renjie and Xiong, Yuwen and Fetaya, Ethan and Zhang, Lisa and Yoon, KiJung and Pitkow, Xaq and Urtasun, Raquel and Zemel, Richard},
  booktitle = {Proceedings of the 35th International Conference on Machine Learning},
  pages     = {3082--3091},
  year      = {2018},
  editor    = {Dy, Jennifer and Krause, Andreas},
  volume    = {80},
  series    = {Proceedings of Machine Learning Research},
  month     = {10--15 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v80/liao18c/liao18c.pdf},
  url       = {https://proceedings.mlr.press/v80/liao18c.html},
  abstract  = {In this paper, we revisit the recurrent back-propagation (RBP) algorithm, discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). We further investigate the relationship between Neumann-RBP and back propagation through time (BPTT) and its truncated version (TBPTT). Our Neumann-RBP has the same time complexity as TBPTT but only requires constant memory, whereas TBPTT’s memory cost scales linearly with the number of truncation steps. We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks and hyperparameter optimization for fully connected networks. All experiments demonstrate that RBPs, especially the Neumann-RBP variant, are efficient and effective for optimizing convergent recurrent neural networks.}
}
Endnote
%0 Conference Paper
%T Reviving and Improving Recurrent Back-Propagation
%A Renjie Liao
%A Yuwen Xiong
%A Ethan Fetaya
%A Lisa Zhang
%A KiJung Yoon
%A Xaq Pitkow
%A Raquel Urtasun
%A Richard Zemel
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-liao18c
%I PMLR
%P 3082--3091
%U https://proceedings.mlr.press/v80/liao18c.html
%V 80
%X In this paper, we revisit the recurrent back-propagation (RBP) algorithm, discuss the conditions under which it applies as well as how to satisfy them in deep neural networks. We show that RBP can be unstable and propose two variants based on conjugate gradient on the normal equations (CG-RBP) and Neumann series (Neumann-RBP). We further investigate the relationship between Neumann-RBP and back propagation through time (BPTT) and its truncated version (TBPTT). Our Neumann-RBP has the same time complexity as TBPTT but only requires constant memory, whereas TBPTT’s memory cost scales linearly with the number of truncation steps. We examine all RBP variants along with BPTT and TBPTT in three different application domains: associative memory with continuous Hopfield networks, document classification in citation networks using graph neural networks and hyperparameter optimization for fully connected networks. All experiments demonstrate that RBPs, especially the Neumann-RBP variant, are efficient and effective for optimizing convergent recurrent neural networks.
APA
Liao, R., Xiong, Y., Fetaya, E., Zhang, L., Yoon, K., Pitkow, X., Urtasun, R. & Zemel, R. (2018). Reviving and Improving Recurrent Back-Propagation. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:3082-3091. Available from https://proceedings.mlr.press/v80/liao18c.html.