Energetic Natural Gradient Descent

Philip Thomas; Bruno Castro Silva; Christoph Dann; Emma Brunskill

Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2887-2895, 2016.

Abstract

We propose a new class of algorithms for minimizing or maximizing functions of parametric probabilistic models. These new algorithms are natural gradient algorithms that leverage more information than prior methods by using a new metric tensor in place of the commonly used Fisher information matrix. This new metric tensor is derived by computing directions of steepest ascent where the distance between distributions is measured using an approximation of energy distance (as opposed to Kullback-Leibler divergence, which produces the Fisher information matrix), and so we refer to our new ascent direction as the energetic natural gradient.
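As a rough illustration of the general idea in the abstract, a natural gradient rescales the ordinary gradient by the inverse of a metric tensor, and different choices of metric give different ascent directions. The sketch below uses the Fisher information matrix of a one-dimensional Gaussian in the (mu, sigma) parameterization as the metric. The energetic metric tensor proposed in the paper is not reproduced here; the function names (`fisher_gaussian`, `solve_2x2`, `natural_gradient`) are hypothetical and chosen only for this example.

```python
def fisher_gaussian(mu, sigma):
    """Fisher information matrix of N(mu, sigma^2) in the (mu, sigma) parameterization.

    It does not depend on mu; the argument is kept so the signature matches the model.
    """
    return [[1.0 / sigma**2, 0.0],
            [0.0, 2.0 / sigma**2]]


def solve_2x2(G, g):
    """Solve G x = g for a 2x2 matrix G using Cramer's rule."""
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if det == 0.0:
        raise ValueError("metric tensor is singular")
    x0 = (g[0] * G[1][1] - G[0][1] * g[1]) / det
    x1 = (G[0][0] * g[1] - g[0] * G[1][0]) / det
    return [x0, x1]


def natural_gradient(grad, metric):
    """Steepest-ascent direction under `metric`, i.e. metric^{-1} @ grad.

    Passing a different positive-definite metric (assumption: 2x2 here)
    changes the direction; the identity metric recovers the plain gradient.
    """
    return solve_2x2(metric, grad)


if __name__ == "__main__":
    grad = [1.0, 1.0]  # ordinary gradient with respect to (mu, sigma)
    print(natural_gradient(grad, fisher_gaussian(0.0, 2.0)))  # [4.0, 2.0]
```

With sigma = 2 the Fisher metric is diag(1/4, 1/2), so the ordinary gradient (1, 1) is rescaled to (4, 2). Swapping in another metric tensor only requires passing a different matrix to `natural_gradient`.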

Cite this Paper

BibTeX


@InProceedings{pmlr-v48-thomasb16,
  title = {Energetic Natural Gradient Descent},
  author = {Thomas, Philip and Silva, Bruno Castro and Dann, Christoph and Brunskill, Emma},
  booktitle = {Proceedings of The 33rd International Conference on Machine Learning},
  pages = {2887--2895},
  year = {2016},
  editor = {Balcan, Maria Florina and Weinberger, Kilian Q.},
  volume = {48},
  series = {Proceedings of Machine Learning Research},
  address = {New York, New York, USA},
  month = {20--22 Jun},
  publisher = {PMLR},
  pdf = {http://proceedings.mlr.press/v48/thomasb16.pdf},
  url = {https://proceedings.mlr.press/v48/thomasb16.html},
  abstract = {We propose a new class of algorithms for minimizing or maximizing functions of parametric probabilistic models. These new algorithms are natural gradient algorithms that leverage more information than prior methods by using a new metric tensor in place of the commonly used Fisher information matrix. This new metric tensor is derived by computing directions of steepest ascent where the distance between distributions is measured using an approximation of energy distance (as opposed to Kullback-Leibler divergence, which produces the Fisher information matrix), and so we refer to our new ascent direction as the energetic natural gradient.}
}

Endnote

%0 Conference Paper
%T Energetic Natural Gradient Descent
%A Philip Thomas
%A Bruno Castro Silva
%A Christoph Dann
%A Emma Brunskill
%B Proceedings of The 33rd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2016
%E Maria Florina Balcan
%E Kilian Q. Weinberger
%F pmlr-v48-thomasb16
%I PMLR
%P 2887--2895
%U https://proceedings.mlr.press/v48/thomasb16.html
%V 48
%X We propose a new class of algorithms for minimizing or maximizing functions of parametric probabilistic models. These new algorithms are natural gradient algorithms that leverage more information than prior methods by using a new metric tensor in place of the commonly used Fisher information matrix. This new metric tensor is derived by computing directions of steepest ascent where the distance between distributions is measured using an approximation of energy distance (as opposed to Kullback-Leibler divergence, which produces the Fisher information matrix), and so we refer to our new ascent direction as the energetic natural gradient.

RIS


TY  - CPAPER
TI  - Energetic Natural Gradient Descent
AU  - Philip Thomas
AU  - Bruno Castro Silva
AU  - Christoph Dann
AU  - Emma Brunskill
BT  - Proceedings of The 33rd International Conference on Machine Learning
DA  - 2016/06/11
ED  - Maria Florina Balcan
ED  - Kilian Q. Weinberger
ID  - pmlr-v48-thomasb16
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 48
SP  - 2887
EP  - 2895
L1  - http://proceedings.mlr.press/v48/thomasb16.pdf
UR  - https://proceedings.mlr.press/v48/thomasb16.html
AB  - We propose a new class of algorithms for minimizing or maximizing functions of parametric probabilistic models. These new algorithms are natural gradient algorithms that leverage more information than prior methods by using a new metric tensor in place of the commonly used Fisher information matrix. This new metric tensor is derived by computing directions of steepest ascent where the distance between distributions is measured using an approximation of energy distance (as opposed to Kullback-Leibler divergence, which produces the Fisher information matrix), and so we refer to our new ascent direction as the energetic natural gradient.
ER  -

APA


Thomas, P., Silva, B.C., Dann, C. & Brunskill, E. (2016). Energetic Natural Gradient Descent. Proceedings of The 33rd International Conference on Machine Learning, in Proceedings of Machine Learning Research 48:2887-2895. Available from https://proceedings.mlr.press/v48/thomasb16.html.
