Stein Variational Gradient Descent Without Gradient

Jun Han; Qiang Liu

Proceedings of the 35th International Conference on Machine Learning, PMLR 80:1900-1908, 2018.

Abstract

Stein variational gradient descent (SVGD) has been shown to be a powerful approximate inference algorithm for complex distributions. However, the standard SVGD requires calculating the gradient of the target density and cannot be applied when the gradient is unavailable. In this work, we develop a gradient-free variant of SVGD (GF-SVGD), which replaces the true gradient with a surrogate gradient, and corrects the introduced bias by re-weighting the gradients in a proper form. We show that our GF-SVGD can be viewed as the standard SVGD with a special choice of kernel, and hence directly inherits all the theoretical properties of SVGD. We offer insights into the empirical choice of the surrogate gradient and further propose an annealed GF-SVGD that consistently outperforms a number of recent advanced gradient-free MCMC methods in our empirical studies.
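
As a rough illustration of the idea in the abstract, the following Python sketch performs one update for 1-D particles using only the gradient of a surrogate density. The function name `gf_svgd_step`, the RBF kernel, the fixed bandwidth, the step size, and the self-normalized weights proportional to rho(x)/p(x) are assumptions made for this example, not details stated in the abstract; consult the paper for the authors' exact algorithm.

```python
import math


def gf_svgd_step(xs, log_p, log_rho, grad_log_rho, bandwidth=1.0, step_size=0.1):
    """One gradient-free SVGD-style update for 1-D particles (illustrative sketch).

    Only the surrogate gradient grad_log_rho is evaluated; the target enters
    through log_p alone, and only up to an additive constant.
    """
    # Assumed re-weighting: w_j proportional to rho(x_j) / p(x_j),
    # computed in log space and normalized to sum to one.
    log_w = [log_rho(x) - log_p(x) for x in xs]
    max_log_w = max(log_w)
    w = [math.exp(lw - max_log_w) for lw in log_w]
    total = sum(w)
    w = [wj / total for wj in w]

    new_xs = []
    for xi in xs:
        direction = 0.0
        for xj, wj in zip(xs, w):
            diff = xj - xi
            k = math.exp(-diff * diff / (2.0 * bandwidth ** 2))  # RBF kernel k(xj, xi)
            grad_k = -diff / bandwidth ** 2 * k  # derivative of k(xj, xi) w.r.t. xj
            # Surrogate-gradient drift plus kernel repulsion, both re-weighted.
            direction += wj * (grad_log_rho(xj) * k + grad_k)
        new_xs.append(xi + step_size * direction)
    return new_xs


# Hypothetical example: target is an unnormalized N(2, 1) whose gradient is
# never used; the surrogate is N(0, 3^2), whose gradient is known.
def log_p(x):
    return -0.5 * (x - 2.0) ** 2


def log_rho(x):
    return -x * x / 18.0


def grad_log_rho(x):
    return -x / 9.0


particles = [-1.0, -0.5, 0.0, 0.5, 1.0]
updated = gf_svgd_step(particles, log_p, log_rho, grad_log_rho)
```

Because the weights are normalized over the particles, the target density only needs to be known up to a constant. In this toy setup the particles that sit far from the target receive larger weights, so their repulsion pushes the particle set toward the region the target favors.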

Cite this Paper

BibTeX


@InProceedings{pmlr-v80-han18b,
  title = 	 {Stein Variational Gradient Descent Without Gradient},
  author =       {Han, Jun and Liu, Qiang},
  booktitle = 	 {Proceedings of the 35th International Conference on Machine Learning},
  pages = 	 {1900--1908},
  year = 	 {2018},
  editor = 	 {Dy, Jennifer and Krause, Andreas},
  volume = 	 {80},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {10--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v80/han18b/han18b.pdf},
  url = 	 {https://proceedings.mlr.press/v80/han18b.html},
  abstract = 	 {Stein variational gradient descent (SVGD) has been shown to be a powerful approximate inference algorithm for complex distributions. However, the standard SVGD requires calculating the gradient of the target density and cannot be applied when the gradient is unavailable. In this work, we develop a gradient-free variant of SVGD (GF-SVGD), which replaces the true gradient with a surrogate gradient, and corrects the introduced bias by re-weighting the gradients in a proper form. We show that our GF-SVGD can be viewed as the standard SVGD with a special choice of kernel, and hence directly inherits all the theoretical properties of SVGD. We offer insights into the empirical choice of the surrogate gradient and further propose an annealed GF-SVGD that consistently outperforms a number of recent advanced gradient-free MCMC methods in our empirical studies.}
}

Endnote

%0 Conference Paper
%T Stein Variational Gradient Descent Without Gradient
%A Jun Han
%A Qiang Liu
%B Proceedings of the 35th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jennifer Dy
%E Andreas Krause
%F pmlr-v80-han18b
%I PMLR
%P 1900--1908
%U https://proceedings.mlr.press/v80/han18b.html
%V 80
%X Stein variational gradient descent (SVGD) has been shown to be a powerful approximate inference algorithm for complex distributions. However, the standard SVGD requires calculating the gradient of the target density and cannot be applied when the gradient is unavailable. In this work, we develop a gradient-free variant of SVGD (GF-SVGD), which replaces the true gradient with a surrogate gradient, and corrects the introduced bias by re-weighting the gradients in a proper form. We show that our GF-SVGD can be viewed as the standard SVGD with a special choice of kernel, and hence directly inherits all the theoretical properties of SVGD. We offer insights into the empirical choice of the surrogate gradient and further propose an annealed GF-SVGD that consistently outperforms a number of recent advanced gradient-free MCMC methods in our empirical studies.

APA


Han, J. & Liu, Q. (2018). Stein Variational Gradient Descent Without Gradient. Proceedings of the 35th International Conference on Machine Learning, in Proceedings of Machine Learning Research 80:1900-1908. Available from https://proceedings.mlr.press/v80/han18b.html.
