Relative Fisher Information and Natural Gradient for Learning Large Modular Models

Ke Sun, Frank Nielsen
Proceedings of the 34th International Conference on Machine Learning, PMLR 70:3289-3298, 2017.

Abstract

Fisher information and natural gradient have provided deep insights and powerful tools for artificial neural networks. However, the related analysis becomes increasingly difficult as the learner's structure grows large and complex. This paper makes a preliminary step in a new direction. We extract a local component from a large neural system and define its relative Fisher information metric, which describes this small component accurately and is invariant to the other parts of the system. This concept is important because the geometric structure is much simplified and can easily be applied to guide the learning of neural networks. We analyze a list of commonly used components and demonstrate how to use this concept to further improve optimization.
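As background for the natural-gradient methods the abstract refers to, the sketch below shows a plain (not relative) natural-gradient update on a tiny logistic-regression component, with the Fisher information matrix estimated empirically from per-sample score vectors. This is an illustrative example only, not the paper's algorithm; the function name `natural_gradient_step`, the toy data, and the damping constant are our own choices.

```python
import numpy as np

# Illustrative sketch (not the paper's method): natural-gradient descent
# for a small logistic-regression model. The update is
#   w <- w - lr * F^{-1} grad,
# where F is an empirical Fisher information estimate.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
true_w = np.array([1.5, -2.0])
y = (rng.random(200) < 1.0 / (1.0 + np.exp(-X @ true_w))).astype(float)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def natural_gradient_step(w, X, y, lr=0.5, damping=1e-3):
    p = sigmoid(X @ w)
    # Per-sample score vectors: d log p(y|x; w) / dw = (y - p) * x
    scores = (y - p)[:, None] * X
    grad = -scores.mean(axis=0)           # gradient of the mean negative log-likelihood
    fisher = scores.T @ scores / len(y)   # empirical Fisher estimate E[s s^T]
    fisher += damping * np.eye(len(w))    # damping keeps the matrix invertible
    return w - lr * np.linalg.solve(fisher, grad)

w = np.zeros(2)
for _ in range(100):
    w = natural_gradient_step(w, X, y)
```

Preconditioning the gradient by the inverse Fisher matrix makes the update invariant to smooth reparameterizations of `w`; the paper's contribution is a *relative* Fisher metric that restricts this idea to one component of a large modular model.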

Cite this Paper


BibTeX
@InProceedings{pmlr-v70-sun17b,
  title     = {Relative {F}isher Information and Natural Gradient for Learning Large Modular Models},
  author    = {Ke Sun and Frank Nielsen},
  booktitle = {Proceedings of the 34th International Conference on Machine Learning},
  pages     = {3289--3298},
  year      = {2017},
  editor    = {Precup, Doina and Teh, Yee Whye},
  volume    = {70},
  series    = {Proceedings of Machine Learning Research},
  month     = {06--11 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v70/sun17b/sun17b.pdf},
  url       = {https://proceedings.mlr.press/v70/sun17b.html},
  abstract  = {Fisher information and natural gradient provided deep insights and powerful tools to artificial neural networks. However related analysis becomes more and more difficult as the learner’s structure turns large and complex. This paper makes a preliminary step towards a new direction. We extract a local component from a large neural system, and define its relative Fisher information metric that describes accurately this small component, and is invariant to the other parts of the system. This concept is important because the geometry structure is much simplified and it can be easily applied to guide the learning of neural networks. We provide an analysis on a list of commonly used components, and demonstrate how to use this concept to further improve optimization.}
}
Endnote
%0 Conference Paper
%T Relative Fisher Information and Natural Gradient for Learning Large Modular Models
%A Ke Sun
%A Frank Nielsen
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh
%F pmlr-v70-sun17b
%I PMLR
%P 3289--3298
%U https://proceedings.mlr.press/v70/sun17b.html
%V 70
%X Fisher information and natural gradient provided deep insights and powerful tools to artificial neural networks. However related analysis becomes more and more difficult as the learner’s structure turns large and complex. This paper makes a preliminary step towards a new direction. We extract a local component from a large neural system, and define its relative Fisher information metric that describes accurately this small component, and is invariant to the other parts of the system. This concept is important because the geometry structure is much simplified and it can be easily applied to guide the learning of neural networks. We provide an analysis on a list of commonly used components, and demonstrate how to use this concept to further improve optimization.
APA
Sun, K. & Nielsen, F. (2017). Relative Fisher Information and Natural Gradient for Learning Large Modular Models. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:3289-3298. Available from https://proceedings.mlr.press/v70/sun17b.html.
