Scalable Gradients for Stochastic Differential Equations

Xuechen Li, Ting-Kam Leonard Wong, Ricky T. Q. Chen, David Duvenaud
Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, PMLR 108:3870-3882, 2020.

Abstract

The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.
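For intuition, the ODE adjoint sensitivity method that the paper generalizes can be sketched as follows. This is a minimal illustration, not the paper's SDE algorithm: it uses a hand-chosen linear drift f(z, θ) = −θz, fixed-step Euler integration, and a loss L = z(T), and it reconstructs the state backward in time rather than storing the trajectory (the constant-memory property the abstract refers to). All function names are illustrative.

```python
import math

def f(z, theta):          # drift of dz/dt = f(z, theta); here f = -theta * z
    return -theta * z

def df_dz(z, theta):      # partial derivative of f w.r.t. the state z
    return -theta

def df_dtheta(z, theta):  # partial derivative of f w.r.t. the parameter theta
    return -z

def adjoint_gradient(z0, theta, T, n=20000):
    """Gradient dL/dtheta of L = z(T) via the adjoint method."""
    dt = T / n
    # Forward pass: Euler integration to z(T); no trajectory is stored.
    z = z0
    for _ in range(n):
        z += dt * f(z, theta)
    # Backward pass: integrate the state z, the adjoint a = dL/dz(t),
    # and the accumulated parameter gradient g jointly from T back to 0:
    #   da/dt = -a * df/dz,   dg/dt = -a * df/dtheta.
    a = 1.0               # dL/dz(T) for the loss L = z(T)
    g = 0.0               # accumulates dL/dtheta
    for _ in range(n):
        z -= dt * f(z, theta)                 # reverse the state in time
        a += dt * a * df_dz(z, theta)
        g += dt * a * df_dtheta(z, theta)
    return g

# Analytic check: z(T) = z0 * exp(-theta * T), so dL/dtheta = -T * z0 * exp(-theta * T).
grad = adjoint_gradient(z0=1.0, theta=0.5, T=1.0)
exact = -1.0 * math.exp(-0.5)
```

The SDE setting of the paper additionally has a diffusion term driven by Brownian motion, so naively reversing the state is no longer possible; this is what motivates the backward (Stratonovich) adjoint SDE and the memory-efficient noise-caching scheme described in the abstract.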

Cite this Paper


BibTeX
@InProceedings{pmlr-v108-li20i,
  title     = {Scalable Gradients for Stochastic Differential Equations},
  author    = {Li, Xuechen and Wong, Ting-Kam Leonard and Chen, Ricky T. Q. and Duvenaud, David},
  booktitle = {Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics},
  pages     = {3870--3882},
  year      = {2020},
  editor    = {Chiappa, Silvia and Calandra, Roberto},
  volume    = {108},
  series    = {Proceedings of Machine Learning Research},
  month     = {26--28 Aug},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v108/li20i/li20i.pdf},
  url       = {http://proceedings.mlr.press/v108/li20i.html},
  abstract  = {The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.}
}
Endnote
%0 Conference Paper
%T Scalable Gradients for Stochastic Differential Equations
%A Xuechen Li
%A Ting-Kam Leonard Wong
%A Ricky T. Q. Chen
%A David Duvenaud
%B Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2020
%E Silvia Chiappa
%E Roberto Calandra
%F pmlr-v108-li20i
%I PMLR
%P 3870--3882
%U http://proceedings.mlr.press/v108/li20i.html
%V 108
%X The adjoint sensitivity method scalably computes gradients of solutions to ordinary differential equations. We generalize this method to stochastic differential equations, allowing time-efficient and constant-memory computation of gradients with high-order adaptive solvers. Specifically, we derive a stochastic differential equation whose solution is the gradient, a memory-efficient algorithm for caching noise, and conditions under which numerical solutions converge. In addition, we combine our method with gradient-based stochastic variational inference for latent stochastic differential equations. We use our method to fit stochastic dynamics defined by neural networks, achieving competitive performance on a 50-dimensional motion capture dataset.
APA
Li, X., Wong, T.L., Chen, R.T.Q. & Duvenaud, D. (2020). Scalable Gradients for Stochastic Differential Equations. Proceedings of the Twenty Third International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 108:3870-3882. Available from http://proceedings.mlr.press/v108/li20i.html.