Stochastic Gradient Hamiltonian Monte Carlo

Tianqi Chen; Emily Fox; Carlos Guestrin

Proceedings of the 31st International Conference on Machine Learning, PMLR 32(2):1683-1691, 2014.

Abstract

Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals. The popularity of such methods has grown significantly in recent years. However, a limitation of HMC methods is the required gradient computation for simulation of the Hamiltonian dynamical system; such computation is infeasible in problems involving a large sample size or streaming data. Instead, we must rely on a noisy gradient estimate computed from a subset of the data. In this paper, we explore the properties of such a stochastic gradient HMC approach. Surprisingly, the natural implementation of the stochastic approximation can be arbitrarily bad. To address this problem we introduce a variant that uses second-order Langevin dynamics with a friction term that counteracts the effects of the noisy gradient, maintaining the desired target distribution as the invariant distribution. Results on simulated data validate our theory. We also provide an application of our methods to a classification task using neural networks and to online Bayesian matrix factorization.
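
The friction idea in the abstract can be illustrated with a small sketch. It is not the authors' code. It assumes a one-dimensional standard normal target with potential U(theta) = theta^2 / 2, unit mass, and Gaussian gradient noise whose variance is known exactly. The function name `sghmc_standard_normal` and all parameter values are hypothetical choices for illustration. Each step moves the position along the momentum, then updates the momentum with the noisy gradient, a friction term, and injected noise scaled so that the total noise matches the friction.

```python
import math
import random


def sghmc_standard_normal(n_steps=100_000, step=0.1, friction=1.0,
                          grad_noise_var=4.0, burn_in=1_000, seed=0):
    """Draw approximate samples of theta ~ N(0, 1) from noisy gradients.

    Illustrative assumptions: U(theta) = theta**2 / 2, unit mass, and a
    known gradient-noise variance (in practice it has to be estimated).
    """
    rng = random.Random(seed)
    # Diffusion contributed by the gradient noise over one step.
    b_hat = 0.5 * step * grad_noise_var
    if friction < b_hat:
        raise ValueError("friction must be at least 0.5 * step * grad_noise_var")
    # Extra noise so that friction and total noise stay balanced.
    injected_sd = math.sqrt(2.0 * (friction - b_hat) * step)

    theta, r = 0.0, 0.0
    samples = []
    for i in range(n_steps):
        theta += step * r  # position update
        # Exact gradient of U is theta; add simulated minibatch noise.
        noisy_grad = theta + rng.gauss(0.0, math.sqrt(grad_noise_var))
        # Momentum update: noisy gradient, friction, injected noise.
        r += -step * noisy_grad - step * friction * r + rng.gauss(0.0, injected_sd)
        if i >= burn_in:
            samples.append(theta)
    return samples


samples = sghmc_standard_normal()
mean = sum(samples) / len(samples)
var = sum((s - mean) ** 2 for s in samples) / len(samples)
print(f"mean={mean:.3f} var={var:.3f}")
```

With these settings the sample mean and variance should land near 0 and 1. Setting `friction` below `0.5 * step * grad_noise_var` is rejected, because the injected noise variance would then be negative.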

Cite this Paper

BibTeX


@InProceedings{pmlr-v32-cheni14,
  title = 	 {Stochastic Gradient Hamiltonian Monte Carlo},
  author = 	 {Chen, Tianqi and Fox, Emily and Guestrin, Carlos},
  booktitle = 	 {Proceedings of the 31st International Conference on Machine Learning},
  pages = 	 {1683--1691},
  year = 	 {2014},
  editor = 	 {Xing, Eric P. and Jebara, Tony},
  volume = 	 {32},
  number =       {2},
  series = 	 {Proceedings of Machine Learning Research},
  address = 	 {Beijing, China},
  month = 	 {22--24 Jun},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v32/cheni14.pdf},
  url = 	 {https://proceedings.mlr.press/v32/cheni14.html},
  abstract = 	 {Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.  The popularity of such methods has grown significantly in recent years.  However, a limitation of HMC methods is the required gradient computation for simulation of the Hamiltonian dynamical system; such computation is infeasible in problems involving a large sample size or streaming data. Instead, we must rely on a noisy gradient estimate computed from a subset of the data.  In this paper, we explore the properties of such a stochastic gradient HMC approach. Surprisingly, the natural implementation of the stochastic approximation can be arbitrarily bad.  To address this problem we introduce a variant that uses second-order Langevin dynamics with a friction term that counteracts the effects of the noisy gradient, maintaining the desired target distribution as the invariant distribution.  Results on simulated data validate our theory.  We also provide an application of our methods to a classification task using neural networks and to online Bayesian matrix factorization.}
}

Endnote

%0 Conference Paper
%T Stochastic Gradient Hamiltonian Monte Carlo
%A Tianqi Chen
%A Emily Fox
%A Carlos Guestrin
%B Proceedings of the 31st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2014
%E Eric P. Xing
%E Tony Jebara
%F pmlr-v32-cheni14
%I PMLR
%P 1683--1691
%U https://proceedings.mlr.press/v32/cheni14.html
%V 32
%N 2
%X Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.  The popularity of such methods has grown significantly in recent years.  However, a limitation of HMC methods is the required gradient computation for simulation of the Hamiltonian dynamical system; such computation is infeasible in problems involving a large sample size or streaming data. Instead, we must rely on a noisy gradient estimate computed from a subset of the data.  In this paper, we explore the properties of such a stochastic gradient HMC approach. Surprisingly, the natural implementation of the stochastic approximation can be arbitrarily bad.  To address this problem we introduce a variant that uses second-order Langevin dynamics with a friction term that counteracts the effects of the noisy gradient, maintaining the desired target distribution as the invariant distribution.  Results on simulated data validate our theory.  We also provide an application of our methods to a classification task using neural networks and to online Bayesian matrix factorization.

RIS


TY  - CPAPER
TI  - Stochastic Gradient Hamiltonian Monte Carlo
AU  - Tianqi Chen
AU  - Emily Fox
AU  - Carlos Guestrin
BT  - Proceedings of the 31st International Conference on Machine Learning
DA  - 2014/06/18
ED  - Eric P. Xing
ED  - Tony Jebara
ID  - pmlr-v32-cheni14
PB  - PMLR
DP  - Proceedings of Machine Learning Research
VL  - 32
IS  - 2
SP  - 1683
EP  - 1691
L1  - http://proceedings.mlr.press/v32/cheni14.pdf
UR  - https://proceedings.mlr.press/v32/cheni14.html
AB  - Hamiltonian Monte Carlo (HMC) sampling methods provide a mechanism for defining distant proposals with high acceptance probabilities in a Metropolis-Hastings framework, enabling more efficient exploration of the state space than standard random-walk proposals.  The popularity of such methods has grown significantly in recent years.  However, a limitation of HMC methods is the required gradient computation for simulation of the Hamiltonian dynamical system; such computation is infeasible in problems involving a large sample size or streaming data. Instead, we must rely on a noisy gradient estimate computed from a subset of the data.  In this paper, we explore the properties of such a stochastic gradient HMC approach. Surprisingly, the natural implementation of the stochastic approximation can be arbitrarily bad.  To address this problem we introduce a variant that uses second-order Langevin dynamics with a friction term that counteracts the effects of the noisy gradient, maintaining the desired target distribution as the invariant distribution.  Results on simulated data validate our theory.  We also provide an application of our methods to a classification task using neural networks and to online Bayesian matrix factorization.
ER  -

APA


Chen, T., Fox, E. & Guestrin, C. (2014). Stochastic Gradient Hamiltonian Monte Carlo. Proceedings of the 31st International Conference on Machine Learning, in Proceedings of Machine Learning Research 32(2):1683-1691. Available from https://proceedings.mlr.press/v32/cheni14.html.
