Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix

Roger Grosse, Ruslan Salakhutdinov
Proceedings of the 32nd International Conference on Machine Learning, PMLR 37:2304-2313, 2015.

Abstract

Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.
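As a point of reference for the abstract, the sketch below shows the exact, dense natural-gradient direction that an approximation like FANG is meant to avoid computing. It is not the authors' method. It assumes an exponential-family model (such as an RBM), where the Fisher matrix equals the covariance of the sufficient statistics, and it estimates that covariance from samples. The function name `natural_gradient_direction` and the `damping` parameter are illustrative assumptions, not taken from the paper.

```python
import numpy as np


def natural_gradient_direction(stats, grad, damping=1e-3):
    """Return the damped natural-gradient direction (F + damping * I)^{-1} grad.

    Hypothetical helper for illustration only (not from the paper).
    stats: (n_samples, n_params) sufficient statistics sampled from the model.
    grad:  (n_params,) ordinary gradient of the objective.
    """
    stats = np.asarray(stats, dtype=float)
    grad = np.asarray(grad, dtype=float)

    # For an exponential family, the Fisher matrix is the covariance of the
    # sufficient statistics; estimate it from the samples.
    centered = stats - stats.mean(axis=0, keepdims=True)
    fisher = centered.T @ centered / stats.shape[0]

    # Damping keeps the linear system well conditioned (an assumption here).
    damped = fisher + damping * np.eye(fisher.shape[0])

    # Dense solve: cubic cost in the number of parameters, which is the
    # bottleneck in high dimensions.
    return np.linalg.solve(damped, grad)


# Tiny example: statistics whose covariance is diag(0.5, 2.0).
stats = [[1.0, 0.0], [-1.0, 0.0], [0.0, 2.0], [0.0, -2.0]]
direction = natural_gradient_direction(stats, [1.0, 1.0], damping=0.0)
```

The dense solve scales cubically with the number of parameters, which is why the abstract describes these systems as hard to solve for high-dimensional problems.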

Cite this Paper


BibTeX
@InProceedings{pmlr-v37-grosse15,
  title = {Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix},
  author = {Grosse, Roger and Salakhutdinov, Ruslan},
  booktitle = {Proceedings of the 32nd International Conference on Machine Learning},
  pages = {2304--2313},
  year = {2015},
  editor = {Bach, Francis and Blei, David},
  volume = {37},
  series = {Proceedings of Machine Learning Research},
  address = {Lille, France},
  month = {07--09 Jul},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v37/grosse15.pdf},
  url = {http://proceedings.mlr.press/v37/grosse15.html},
  abstract = {Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.}
}
Endnote
%0 Conference Paper
%T Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix
%A Roger Grosse
%A Ruslan Salakhutdinov
%B Proceedings of the 32nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2015
%E Francis Bach
%E David Blei
%F pmlr-v37-grosse15
%I PMLR
%P 2304--2313
%U http://proceedings.mlr.press/v37/grosse15.html
%V 37
%X Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.
RIS
TY  - CPAPER
TI  - Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix
AU  - Roger Grosse
AU  - Ruslan Salakhutdinov
BT  - Proceedings of the 32nd International Conference on Machine Learning
DA  - 2015/06/01
ED  - Francis Bach
ED  - David Blei
ID  - pmlr-v37-grosse15
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 37
SP  - 2304
EP  - 2313
L1  - http://proceedings.mlr.press/v37/grosse15.pdf
UR  - http://proceedings.mlr.press/v37/grosse15.html
AB  - Second-order optimization methods, such as natural gradient, are difficult to apply to high-dimensional problems, because they require approximately solving large linear systems. We present FActorized Natural Gradient (FANG), an approximation to natural gradient descent where the Fisher matrix is approximated with a Gaussian graphical model whose precision matrix can be computed efficiently. We analyze the Fisher matrix for a small RBM and derive an extremely sparse graphical model which is a good match to the covariance of the sufficient statistics. Our experiments indicate that FANG allows RBMs to be trained more efficiently compared with stochastic gradient descent. Additionally, our analysis yields insight into the surprisingly good performance of the “centering trick” for training RBMs.
ER  -
APA
Grosse, R. & Salakhutdinov, R. (2015). Scaling up Natural Gradient by Sparsely Factorizing the Inverse Fisher Matrix. Proceedings of the 32nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 37:2304-2313. Available from http://proceedings.mlr.press/v37/grosse15.html.
