Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix

Roger Grosse, Ruslan Salakhutdinov

Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2304-2313, 2015.

Abstract

Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.

Cite this Paper

BibTeX


@InProceedings{pmlr-v37-grosse15,
  title = 	 {Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix},
  author = 	 {Grosse, Roger and Salakhutdinov, Ruslan},
  booktitle = 	 {Proceedings of the 32nd International Conference on Machine Learning},
  pages = 	 {2304--2313},
  year = 	 {2015},
  editor = 	 {Bach, Francis and Blei, David},
  volume = 	 {37},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Lille, France},
  month = 	 {07--09 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v37/grosse15.pdf},
  url = 	 {https://proceedings.mlr.press/v37/grosse15.html},
  abstract = 	 {Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.}
}

Endnote

%0 Conference Paper
%T Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix
%A Roger Grosse
%A Ruslan Salakhutdinov
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-grosse15
%I PMLR
%P 2304--2313
%U https://proceedings.mlr.press/v37/grosse15.html
%V 37
%X Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.

RIS


TY  - CPAPER
TI  - Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix
AU  - Roger Grosse
AU  - Ruslan Salakhutdinov
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-grosse15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 2304
EP  - 2313
L1  - http://proceedings.mlr.press/v37/grosse15.pdf
UR  - https://proceedings.mlr.press/v37/grosse15.html
AB  - Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.
ER  -

APA


Grosse, R. & Salakhutdinov, R. (2015). Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:2304-2313. Available from https://proceedings.mlr.press/v37/grosse15.html.
