Gated Feedback Recurrent Neural Networks

Junyoung Chung; Caglar Gulcehre; Kyunghyun Cho; Yoshua Bengio

Gated Feedback Recurrent Neural Networks

Junyoung Chung, Caglar Gulcehre, Kyunghyun Cho, Yoshua Bengio

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2067-2075, 2015.

Abstract

In this work, we propose a novel recurrent neural network (RNN) architecture. The proposed RNN, gated-feedback RNN (GF-RNN), extends the existing approach of stacking multiple recurrent layers by allowing and controlling signals flowing from upper recurrent layers to lower layers using a global gating unit for each pair of layers. The recurrent signals exchanged between layers are gated adaptively based on the previous hidden states and the current input. We evaluated the proposed GF-RNN with different types of recurrent units, such as tanh, long short-term memory and gated recurrent units, on the tasks of character-level language modeling and Python program evaluation. Our empirical evaluation of different RNN units, revealed that in both tasks, the GF-RNN outperforms the conventional approaches to build deep stacked RNNs. We suggest that the improvement arises because the GF-RNN can adaptively assign different layers to different timescales and layer-to-layer interactions (including the top-down ones which are not usually present in a stacked RNN) by learning to gate these interactions.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-chung15,
  title = 	 {Gated Feedback Recurrent Neural Networks},
  author = 	 {Chung, Junyoung and Gulcehre, Caglar and Cho, Kyunghyun and Bengio, Yoshua},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {2067--2075},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/chung15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/chung15.html},
  abstract = 	 {In this work, we propose a novel recurrent neural network (RNN) architecture. The proposed RNN, gated-feedback RNN (GF-RNN), extends the existing approach of stacking multiple recurrent layers by allowing and controlling signals flowing from upper recurrent layers to lower layers using a global gating unit for each pair of layers. The recurrent signals exchanged between layers are gated adaptively based on the previous hidden states and the current input. We evaluated the proposed GF-RNN with different types of recurrent units, such as tanh, long short-term memory and gated recurrent units, on the tasks of character-level language modeling and Python program evaluation. Our empirical evaluation of different RNN units, revealed that in both tasks, the GF-RNN outperforms the conventional approaches to build deep stacked RNNs. We suggest that the improvement arises because the GF-RNN can adaptively assign different layers to different timescales and layer-to-layer interactions (including the top-down ones which are not usually present in a stacked RNN) by learning to gate these interactions.}
}

Endnote

%0 Conference Paper
%T Gated Feedback Recurrent Neural Networks
%A Junyoung Chung
%A Caglar Gulcehre
%A Kyunghyun Cho
%A Yoshua Bengio
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei	
%F pmlr-v37-chung15
%I PMLR
%P 2067--2075
%U https://proceedings.mlr.press/v37/chung15.html
%V 37
%X In this work, we propose a novel recurrent neural network (RNN) architecture. The proposed RNN, gated-feedback RNN (GF-RNN), extends the existing approach of stacking multiple recurrent layers by allowing and controlling signals flowing from upper recurrent layers to lower layers using a global gating unit for each pair of layers. The recurrent signals exchanged between layers are gated adaptively based on the previous hidden states and the current input. We evaluated the proposed GF-RNN with different types of recurrent units, such as tanh, long short-term memory and gated recurrent units, on the tasks of character-level language modeling and Python program evaluation. Our empirical evaluation of different RNN units, revealed that in both tasks, the GF-RNN outperforms the conventional approaches to build deep stacked RNNs. We suggest that the improvement arises because the GF-RNN can adaptively assign different layers to different timescales and layer-to-layer interactions (including the top-down ones which are not usually present in a stacked RNN) by learning to gate these interactions.

RIS


TY  - CPAPER
TI  - Gated Feedback Recurrent Neural Networks
AU  - Junyoung Chung
AU  - Caglar Gulcehre
AU  - Kyunghyun Cho
AU  - Yoshua Bengio
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei	
ID  - pmlr-v37-chung15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 2067
EP  - 2075
L1  - http://proceedings.mlr.press/v37/chung15.pdf
UR  - https://proceedings.mlr.press/v37/chung15.html
AB  - In this work, we propose a novel recurrent neural network (RNN) architecture. The proposed RNN, gated-feedback RNN (GF-RNN), extends the existing approach of stacking multiple recurrent layers by allowing and controlling signals flowing from upper recurrent layers to lower layers using a global gating unit for each pair of layers. The recurrent signals exchanged between layers are gated adaptively based on the previous hidden states and the current input. We evaluated the proposed GF-RNN with different types of recurrent units, such as tanh, long short-term memory and gated recurrent units, on the tasks of character-level language modeling and Python program evaluation. Our empirical evaluation of different RNN units, revealed that in both tasks, the GF-RNN outperforms the conventional approaches to build deep stacked RNNs. We suggest that the improvement arises because the GF-RNN can adaptively assign different layers to different timescales and layer-to-layer interactions (including the top-down ones which are not usually present in a stacked RNN) by learning to gate these interactions.
ER  -

APA


Chung, J., Gulcehre, C., Cho, K. & Bengio, Y.. (2015). Gated Feedback Recurrent Neural Networks. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:2067-2075 Available from https://proceedings.mlr.press/v37/chung15.html.

Gated Feedback Recurrent Neural Networks

Abstract

Cite this Paper

Related Material