Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models

Fan Bao; Kun Xu; Chongxuan Li; Lanqing Hong; Jun Zhu; Bo Zhang

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:651-661, 2021.

Abstract

This paper presents new estimates of the score function and its gradient with respect to the model parameters in a general energy-based latent variable model (EBLVM). The score function and its gradient can be expressed as combinations of expectation and covariance terms over the (generally intractable) posterior of the latent variables. New estimates are obtained by introducing a variational posterior to approximate the true posterior in these terms. The variational posterior is trained to minimize a certain divergence (e.g., the KL divergence) between itself and the true posterior. Theoretically, the divergence characterizes upper bounds of the bias of the estimates. In principle, our estimates can be applied to a wide range of objectives, including kernelized Stein discrepancy (KSD), score matching (SM)-based methods and exact Fisher divergence with a minimal model assumption. In particular, these estimates applied to SM-based methods outperform existing methods in learning EBLVMs on several image datasets.
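
As a brief illustration of the expectation and covariance structure mentioned above, the following sketch writes out the identities under assumed notation that the abstract itself does not fix: visible variables $v$, latent variables $h$, an energy function $\mathcal{E}(v, h; \theta)$ with $p(v, h; \theta) \propto \exp(-\mathcal{E}(v, h; \theta))$, and a variational posterior $q(h \mid v; \phi)$. It is a standard derivation for this model class and may differ in notation and detail from the paper's own presentation.

```latex
% Assumed notation (not taken from the abstract):
%   p(v, h; \theta) = \exp(-\mathcal{E}(v, h; \theta)) / Z(\theta)
%   q(h \mid v; \phi) approximates the true posterior p(h \mid v; \theta)

% Score function: an expectation over the true posterior.
\nabla_v \log p(v; \theta)
  = -\,\mathbb{E}_{p(h \mid v; \theta)}
      \big[ \nabla_v \mathcal{E}(v, h; \theta) \big]

% Its gradient w.r.t. the parameters: an expectation plus a covariance,
% both taken over the true posterior.
\frac{\partial}{\partial \theta} \nabla_v \log p(v; \theta)
  = -\,\mathbb{E}_{p(h \mid v; \theta)}
      \Big[ \frac{\partial}{\partial \theta} \nabla_v \mathcal{E}(v, h; \theta) \Big]
    + \mathrm{Cov}_{p(h \mid v; \theta)}
      \big( \nabla_v \mathcal{E}(v, h; \theta),\,
            \nabla_\theta \mathcal{E}(v, h; \theta) \big)

% Variational estimate of the score: replace the intractable posterior by q.
\nabla_v \log p(v; \theta)
  \approx -\,\mathbb{E}_{q(h \mid v; \phi)}
      \big[ \nabla_v \mathcal{E}(v, h; \theta) \big]
```

The approximation in the last line is exact when $q(h \mid v; \phi)$ equals the true posterior, which is consistent with the abstract's statement that the divergence between the two bounds the bias of the estimates.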

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-bao21b,
  title = 	 {Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models},
  author =       {Bao, Fan and Xu, Kun and Li, Chongxuan and Hong, Lanqing and Zhu, Jun and Zhang, Bo},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {651--661},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/bao21b/bao21b.pdf},
  url = 	 {https://proceedings.mlr.press/v139/bao21b.html},
  abstract = 	 {This paper presents new estimates of the score function and its gradient with respect to the model parameters in a general energy-based latent variable model (EBLVM). The score function and its gradient can be expressed as combinations of expectation and covariance terms over the (generally intractable) posterior of the latent variables. New estimates are obtained by introducing a variational posterior to approximate the true posterior in these terms. The variational posterior is trained to minimize a certain divergence (e.g., the KL divergence) between itself and the true posterior. Theoretically, the divergence characterizes upper bounds of the bias of the estimates. In principle, our estimates can be applied to a wide range of objectives, including kernelized Stein discrepancy (KSD), score matching (SM)-based methods and exact Fisher divergence with a minimal model assumption. In particular, these estimates applied to SM-based methods outperform existing methods in learning EBLVMs on several image datasets.}
}

Endnote

%0 Conference Paper
%T Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models
%A Fan Bao
%A Kun Xu
%A Chongxuan Li
%A Lanqing Hong
%A Jun Zhu
%A Bo Zhang
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-bao21b
%I PMLR
%P 651--661
%U https://proceedings.mlr.press/v139/bao21b.html
%V 139
%X This paper presents new estimates of the score function and its gradient with respect to the model parameters in a general energy-based latent variable model (EBLVM). The score function and its gradient can be expressed as combinations of expectation and covariance terms over the (generally intractable) posterior of the latent variables. New estimates are obtained by introducing a variational posterior to approximate the true posterior in these terms. The variational posterior is trained to minimize a certain divergence (e.g., the KL divergence) between itself and the true posterior. Theoretically, the divergence characterizes upper bounds of the bias of the estimates. In principle, our estimates can be applied to a wide range of objectives, including kernelized Stein discrepancy (KSD), score matching (SM)-based methods and exact Fisher divergence with a minimal model assumption. In particular, these estimates applied to SM-based methods outperform existing methods in learning EBLVMs on several image datasets.

APA

Bao, F., Xu, K., Li, C., Hong, L., Zhu, J., & Zhang, B. (2021). Variational (Gradient) Estimate of the Score Function in Energy-based Latent Variable Models. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:651-661. Available from https://proceedings.mlr.press/v139/bao21b.html.
