Convergence of Stein Variational Gradient Descent under a Weaker Smoothness Condition

Lukang Sun, Avetik Karagulyan, Peter Richtarik
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:3693-3717, 2023.

Abstract

Stein Variational Gradient Descent (SVGD) is an important alternative to the Langevin-type algorithms for sampling from probability distributions of the form $\pi(x) \propto \exp(-V(x))$. In the existing theory of Langevin-type algorithms and SVGD, the potential function $V$ is often assumed to be $L$-smooth. However, this restrictive condition excludes a large class of potential functions such as polynomials of degree greater than $2$. Our paper studies the convergence of the SVGD algorithm for distributions with $(L_0,L_1)$-smooth potentials. This relaxed smoothness assumption was introduced by Zhang et al. [2019a] for the analysis of gradient clipping algorithms. With the help of trajectory-independent auxiliary conditions, we provide a descent lemma establishing that the algorithm decreases the KL divergence at each iteration and prove a complexity bound for SVGD in the population limit in terms of the Stein Fisher information.
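
For reference, the $(L_0,L_1)$-smoothness condition of Zhang et al. [2019a] relaxes the usual bounded-Hessian requirement by letting the Hessian grow with the gradient norm. A sketch of the condition (up to notational differences with the paper's exact assumptions):

    % (L_0, L_1)-smoothness in the sense of Zhang et al. [2019a]:
    \[
      \|\nabla^2 V(x)\| \;\le\; L_0 + L_1 \,\|\nabla V(x)\| \qquad \text{for all } x \in \mathbb{R}^d.
    \]
    % E.g. V(x) = \|x\|^4 satisfies this (Hessian grows like \|x\|^2,
    % gradient like \|x\|^3), yet V is not L-smooth for any fixed L.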

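To make the algorithm concrete, here is a minimal NumPy sketch of the standard finite-particle SVGD update (Liu and Wang, 2016) with an RBF kernel. The bandwidth h, step size eta, particle count, and the quartic test potential are illustrative choices, not the paper's settings; the paper's analysis concerns the population limit of this iteration.

    import numpy as np

    def rbf_kernel(X, h):
        # Pairwise RBF kernel matrix: K[i, j] = exp(-||x_i - x_j||^2 / (2 h^2)).
        sq_dists = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=-1)
        return np.exp(-sq_dists / (2.0 * h ** 2))

    def svgd_step(X, grad_log_pi, h=0.5, eta=0.05):
        # One SVGD update on the particle array X of shape (n, d):
        #   phi(x_i) = (1/n) * sum_j [ k(x_j, x_i) grad log pi(x_j)
        #                              + grad_{x_j} k(x_j, x_i) ].
        n = X.shape[0]
        K = rbf_kernel(X, h)                  # (n, n), symmetric
        G = grad_log_pi(X)                    # (n, d); grad log pi = -grad V
        drift = K @ G                         # kernel-weighted drift toward high density
        # Repulsive term: sum_j (x_i - x_j) K[i, j] / h^2, written in matrix form.
        repulse = (K.sum(axis=1, keepdims=True) * X - K @ X) / h ** 2
        return X + eta * (drift + repulse) / n

    # Hypothetical test target: V(x) = ||x||^4 / 4, which is (L_0, L_1)-smooth
    # but not L-smooth for any fixed L; here grad log pi(x) = -||x||^2 x.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    grad_log_pi = lambda Z: -np.sum(Z ** 2, axis=1, keepdims=True) * Z
    for _ in range(500):
        X = svgd_step(X, grad_log_pi)
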
Cite this Paper


BibTeX
@InProceedings{pmlr-v206-sun23d,
  title     = {Convergence of Stein Variational Gradient Descent under a Weaker Smoothness Condition},
  author    = {Sun, Lukang and Karagulyan, Avetik and Richtarik, Peter},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages     = {3693--3717},
  year      = {2023},
  editor    = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume    = {206},
  series    = {Proceedings of Machine Learning Research},
  month     = {25--27 Apr},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v206/sun23d/sun23d.pdf},
  url       = {https://proceedings.mlr.press/v206/sun23d.html},
  abstract  = {Stein Variational Gradient Descent (SVGD) is an important alternative to the Langevin-type algorithms for sampling from probability distributions of the form $\pi(x) \propto \exp(-V(x))$. In the existing theory of Langevin-type algorithms and SVGD, the potential function $V$ is often assumed to be $L$-smooth. However, this restrictive condition excludes a large class of potential functions such as polynomials of degree greater than $2$. Our paper studies the convergence of the SVGD algorithm for distributions with $(L_0,L_1)$-smooth potentials. This relaxed smoothness assumption was introduced by Zhang et al. [2019a] for the analysis of gradient clipping algorithms. With the help of trajectory-independent auxiliary conditions, we provide a descent lemma establishing that the algorithm decreases the KL divergence at each iteration and prove a complexity bound for SVGD in the population limit in terms of the Stein Fisher information.}
}
Endnote
%0 Conference Paper
%T Convergence of Stein Variational Gradient Descent under a Weaker Smoothness Condition
%A Lukang Sun
%A Avetik Karagulyan
%A Peter Richtarik
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-sun23d
%I PMLR
%P 3693--3717
%U https://proceedings.mlr.press/v206/sun23d.html
%V 206
%X Stein Variational Gradient Descent (SVGD) is an important alternative to the Langevin-type algorithms for sampling from probability distributions of the form $\pi(x) \propto \exp(-V(x))$. In the existing theory of Langevin-type algorithms and SVGD, the potential function $V$ is often assumed to be $L$-smooth. However, this restrictive condition excludes a large class of potential functions such as polynomials of degree greater than $2$. Our paper studies the convergence of the SVGD algorithm for distributions with $(L_0,L_1)$-smooth potentials. This relaxed smoothness assumption was introduced by Zhang et al. [2019a] for the analysis of gradient clipping algorithms. With the help of trajectory-independent auxiliary conditions, we provide a descent lemma establishing that the algorithm decreases the KL divergence at each iteration and prove a complexity bound for SVGD in the population limit in terms of the Stein Fisher information.
APA
Sun, L., Karagulyan, A. & Richtarik, P. (2023). Convergence of Stein Variational Gradient Descent under a Weaker Smoothness Condition. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:3693-3717. Available from https://proceedings.mlr.press/v206/sun23d.html.
