Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs

Li Jing; Yichen Shen; Tena Dubcek; John Peurifoy; Scott Skirlo; Yann LeCun; Max Tegmark; Marin Soljačić

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:1733-1741, 2017.

Abstract

Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Network (EUNN); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely $\mathcal{O}(1)$ per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark, and the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-jing17a,
  title = 	 {Tunable Efficient Unitary Neural Networks ({EUNN}) and their application to {RNN}s},
  author =       {Li Jing and Yichen Shen and Tena Dubcek and John Peurifoy and Scott Skirlo and Yann LeCun and Max Tegmark and Marin Solja{\v{c}}i{\'c}},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {1733--1741},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/jing17a/jing17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/jing17a.html},
  abstract = 	 {Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Network (EUNNs); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely $\mathcal{O}(1)$ per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark as well as the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.}
}

Endnote

%0 Conference Paper
%T Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs
%A Li Jing
%A Yichen Shen
%A Tena Dubcek
%A John Peurifoy
%A Scott Skirlo
%A Yann LeCun
%A Max Tegmark
%A Marin Soljačić
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-jing17a
%I PMLR
%P 1733--1741
%U https://proceedings.mlr.press/v70/jing17a.html
%V 70
%X Using unitary (instead of general) matrices in artificial neural networks (ANNs) is a promising way to solve the gradient explosion/vanishing problem, as well as to enable ANNs to learn long-term correlations in the data. This approach appears particularly promising for Recurrent Neural Networks (RNNs). In this work, we present a new architecture for implementing an Efficient Unitary Neural Network (EUNNs); its main advantages can be summarized as follows. Firstly, the representation capacity of the unitary space in an EUNN is fully tunable, ranging from a subspace of SU(N) to the entire unitary space. Secondly, the computational complexity for training an EUNN is merely $\mathcal{O}(1)$ per parameter. Finally, we test the performance of EUNNs on the standard copying task, the pixel-permuted MNIST digit recognition benchmark as well as the Speech Prediction Test (TIMIT). We find that our architecture significantly outperforms both other state-of-the-art unitary RNNs and the LSTM architecture, in terms of the final performance and/or the wall-clock training speed. EUNNs are thus promising alternatives to RNNs and LSTMs for a wide variety of applications.

APA


Jing, L., Shen, Y., Dubcek, T., Peurifoy, J., Skirlo, S., LeCun, Y., Tegmark, M. & Soljačić, M. (2017). Tunable Efficient Unitary Neural Networks (EUNN) and their application to RNNs. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:1733-1741. Available from https://proceedings.mlr.press/v70/jing17a.html.
