Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections

Zakaria Mhammedi, Andrew Hellicar, Ashfaqur Rahman, James Bailey
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:2401-2409, 2017.

Abstract

The problem of learning long-term dependencies in sequences using Recurrent Neural Networks (RNNs) is still a major challenge. Recent methods have been suggested to solve this problem by constraining the transition matrix to be unitary during training, which ensures that its norm is equal to one and prevents exploding gradients. These methods either have limited expressiveness or scale poorly with the size of the network when compared with the simple RNN case, especially when using stochastic gradient descent with a small mini-batch size. Our contributions are as follows: we first show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint. Then we present a new parametrisation of the transition matrix which allows efficient training of an RNN while ensuring that the matrix is always orthogonal. Our results show that the orthogonal constraint on the transition matrix applied through our parametrisation gives similar benefits to the unitary constraint, without the time complexity limitations.
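The abstract does not spell out the parametrisation itself. As a minimal sketch of the general idea named in the title, and not the authors' implementation, the snippet below applies a product of Householder reflections H(u) = I - 2 u u^T / (u^T u) to a vector without ever forming a matrix. Each reflection is orthogonal, so the product is orthogonal and preserves the vector's norm. The function names, the hidden size `n`, and the number of reflections `k` are illustrative assumptions.

```python
import math
import random

def householder_apply(u, h):
    """Return H(u) h, where H(u) = I - 2 u u^T / (u^T u), in O(n) time."""
    uu = sum(x * x for x in u)                 # u^T u (u must be nonzero)
    uh = sum(x * y for x, y in zip(u, h))      # u^T h
    scale = 2.0 * uh / uu
    return [y - scale * x for x, y in zip(u, h)]

def apply_reflections(us, h):
    """Apply the product H(u_1) H(u_2) ... H(u_k) to h, rightmost first."""
    for u in reversed(us):
        h = householder_apply(u, h)
    return h

def norm(v):
    return math.sqrt(sum(x * x for x in v))

random.seed(0)
n, k = 6, 4  # illustrative hidden size and number of reflections
us = [[random.gauss(0.0, 1.0) for _ in range(n)] for _ in range(k)]
h = [random.gauss(0.0, 1.0) for _ in range(n)]

Wh = apply_reflections(us, h)
# An orthogonal map preserves the Euclidean norm, up to rounding error.
print(norm(h), norm(Wh))
```

In this sketch the reflection vectors `us` play the role of the free parameters: any nonzero values give an orthogonal map, and applying `k` reflections costs O(nk) operations per vector.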

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-mhammedi17a,
  title = {Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections},
  author = {Zakaria Mhammedi and Andrew Hellicar and Ashfaqur Rahman and James Bailey},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages = {2401--2409},
  year = {2017},
  editor = {Precup, Doina and Teh, Yee Whye},
  volume = {70},
  series = {Proceedings of Machine Learning Research},
  month = {06--11 Aug},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v70/mhammedi17a/mhammedi17a.pdf},
  url = {http://proceedings.mlr.press/v70/mhammedi17a.html},
  abstract = {The problem of learning long-term dependencies in sequences using Recurrent Neural Networks (RNNs) is still a major challenge. Recent methods have been suggested to solve this problem by constraining the transition matrix to be unitary during training which ensures that its norm is equal to one and prevents exploding gradients. These methods either have limited expressiveness or scale poorly with the size of the network when compared with the simple RNN case, especially when using stochastic gradient descent with a small mini-batch size. Our contributions are as follows; we first show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint. Then we present a new parametrisation of the transition matrix which allows efficient training of an RNN while ensuring that the matrix is always orthogonal. Our results show that the orthogonal constraint on the transition matrix applied through our parametrisation gives similar benefits to the unitary constraint, without the time complexity limitations.}
}
Endnote
%0 Conference Paper
%T Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections
%A Zakaria Mhammedi
%A Andrew Hellicar
%A Ashfaqur Rahman
%A James Bailey
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-mhammedi17a
%I PMLR
%P 2401--2409
%U http://proceedings.mlr.press/v70/mhammedi17a.html
%V 70
%X The problem of learning long-term dependencies in sequences using Recurrent Neural Networks (RNNs) is still a major challenge. Recent methods have been suggested to solve this problem by constraining the transition matrix to be unitary during training which ensures that its norm is equal to one and prevents exploding gradients. These methods either have limited expressiveness or scale poorly with the size of the network when compared with the simple RNN case, especially when using stochastic gradient descent with a small mini-batch size. Our contributions are as follows; we first show that constraining the transition matrix to be unitary is a special case of an orthogonal constraint. Then we present a new parametrisation of the transition matrix which allows efficient training of an RNN while ensuring that the matrix is always orthogonal. Our results show that the orthogonal constraint on the transition matrix applied through our parametrisation gives similar benefits to the unitary constraint, without the time complexity limitations.
APA
Mhammedi, Z., Hellicar, A., Rahman, A. & Bailey, J. (2017). Efficient Orthogonal Parametrisation of Recurrent Neural Networks Using Householder Reflections. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:2401-2409. Available from http://proceedings.mlr.press/v70/mhammedi17a.html
