Structured Logconcave Sampling with a Restricted Gaussian Oracle

Yin Tat Lee, Ruoqi Shen, Kevin Tian
Proceedings of Thirty Fourth Conference on Learning Theory, PMLR 134:2993-3050, 2021.

Abstract

We give algorithms for sampling several structured logconcave families to high accuracy. We further develop a reduction framework, inspired by proximal point methods in convex optimization, which bootstraps samplers for regularized densities to generically improve dependences on problem conditioning $\kappa$ from polynomial to linear. A key ingredient in our framework is the notion of a "restricted Gaussian oracle" (RGO) for $g: \mathbb{R}^d \rightarrow \mathbb{R}$, which is a sampler for distributions whose negative log-likelihood sums a quadratic (in a multiple of the identity) and $g$. By combining our reduction framework with our new samplers, we obtain the following bounds for sampling structured distributions to total variation distance $\epsilon$. For composite densities $\exp(-f(x) - g(x))$, where $f$ has condition number $\kappa$ and convex (but possibly non-smooth) $g$ admits an RGO, we obtain a mixing time of $O(\kappa d \log^3\tfrac{\kappa d}{\epsilon})$, matching the state-of-the-art non-composite bound of Lee et al. No composite samplers with better mixing than general-purpose logconcave samplers were previously known. For logconcave finite sums $\exp(-F(x))$, where $F(x) = \tfrac{1}{n}\sum_{i \in [n]} f_i(x)$ has condition number $\kappa$, we give a sampler querying $\widetilde{O}(n + \kappa\max(d, \sqrt{nd}))$ gradient oracles to $\{f_i\}_{i \in [n]}$. (Here the $\widetilde{O}$ notation hides logarithmic factors in the dimension $d$, problem conditioning $\kappa$, desired accuracy $\epsilon$, and summand count $n$ when applicable; a first-order (gradient) oracle for $f: \mathbb{R}^d \rightarrow \mathbb{R}$ returns $(f(x), \nabla f(x))$ on input $x$, and a zeroth-order (value) oracle returns only $f(x)$.) No high-accuracy samplers with nontrivial gradient query complexity were previously known. For densities with condition number $\kappa$, we give an algorithm obtaining mixing time $O(\kappa d \log^2\tfrac{\kappa d}{\epsilon})$, improving upon Lee et al. by a logarithmic factor with a significantly simpler analysis. We also show a zeroth-order algorithm attains the same query complexity.
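Concretely, an RGO for $g$ with parameter $\eta > 0$, queried at a center $y \in \mathbb{R}^d$, returns a sample from the density proportional to $\exp(-g(x) - \tfrac{1}{2\eta}\|x - y\|_2^2)$; the quadratic term is the "quadratic in a multiple of the identity" of the definition above. The minimal Python sketch below (our own illustration, not the paper's code; the function names and the toy closed-form oracle are assumptions) shows the kind of alternating scheme such an oracle enables: a Gaussian forward step followed by an RGO backward step.

import numpy as np

def rgo_alternating_sampler(rgo, x0, eta, num_steps, seed=0):
    # Illustrative sketch of an RGO-driven alternating (proximal-style)
    # sampler, not the paper's algorithm verbatim. `rgo(y, eta, rng)` is
    # assumed to return a sample with density proportional to
    #     exp(-g(x) - ||x - y||^2 / (2 * eta)),
    # i.e., exactly the restricted Gaussian oracle of the abstract.
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(num_steps):
        # Forward step: y | x ~ N(x, eta * I).
        y = x + np.sqrt(eta) * rng.standard_normal(x.shape)
        # Backward step: x | y follows the RGO target centered at y.
        x = rgo(y, eta, rng)
    return x

def gaussian_rgo(y, eta, rng):
    # Toy RGO for g(x) = ||x||^2 / 2: the target
    # exp(-||x||^2 / 2 - ||x - y||^2 / (2 * eta)) is Gaussian in closed
    # form, with mean y / (1 + eta) and covariance (eta / (1 + eta)) * I.
    mean = y / (1.0 + eta)
    scale = np.sqrt(eta / (1.0 + eta))
    return mean + scale * rng.standard_normal(y.shape)

# Example: approximately sample from exp(-||x||^2 / 2) in dimension 5.
sample = rgo_alternating_sampler(gaussian_rgo, x0=np.ones(5), eta=0.5, num_steps=200)

Each pair of steps is an exact Gibbs update for the joint density proportional to $\exp(-g(x) - \tfrac{1}{2\eta}\|x - y\|_2^2)$, so the stationary $x$-marginal is proportional to $\exp(-g(x))$. The abstract's contributions concern the harder composite, finite-sum, and ill-conditioned settings, where the required RGOs and regularized samplers are supplied by the paper's new methods.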

Cite this Paper


BibTeX
@InProceedings{pmlr-v134-lee21a, title = {Structured Logconcave Sampling with a Restricted Gaussian Oracle}, author = {Lee, Yin Tat and Shen, Ruoqi and Tian, Kevin}, booktitle = {Proceedings of Thirty Fourth Conference on Learning Theory}, pages = {2993--3050}, year = {2021}, editor = {Belkin, Mikhail and Kpotufe, Samory}, volume = {134}, series = {Proceedings of Machine Learning Research}, month = {15--19 Aug}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v134/lee21a/lee21a.pdf}, url = {https://proceedings.mlr.press/v134/lee21a.html}, abstract = {We give algorithms for sampling several structured logconcave families to high accuracy. We further develop a reduction framework, inspired by proximal point methods in convex optimization, which bootstraps samplers for regularized densities to generically improve dependences on problem conditioning $\kappa$ from polynomial to linear. A key ingredient in our framework is the notion of a "restricted Gaussian oracle" (RGO) for $g: \mathbb{R}^d \rightarrow \mathbb{R}$, which is a sampler for distributions whose negative log-likelihood sums a quadratic (in a multiple of the identity) and $g$. By combining our reduction framework with our new samplers, we obtain the following bounds for sampling structured distributions to total variation distance $\epsilon$. For composite densities $\exp(-f(x) - g(x))$, where $f$ has condition number $\kappa$ and convex (but possibly non-smooth) $g$ admits an RGO, we obtain a mixing time of $O(\kappa d \log^3\tfrac{\kappa d}{\epsilon})$, matching the state-of-the-art non-composite bound of Lee et al. No composite samplers with better mixing than general-purpose logconcave samplers were previously known. For logconcave finite sums $\exp(-F(x))$, where $F(x) = \tfrac{1}{n}\sum_{i \in [n]} f_i(x)$ has condition number $\kappa$, we give a sampler querying $\widetilde{O}(n + \kappa\max(d, \sqrt{nd}))$ gradient oracles\footnote{For convenience of exposition, the $\widetilde{O}$ notation hides logarithmic factors in the dimension $d$, problem conditioning $\kappa$, desired accuracy $\epsilon$, and summand count $n$ (when applicable). A first-order (gradient) oracle for $f:\mathbb{R}^d \rightarrow \mathbb{R}$ returns $(f(x), \nabla f(x))$ on input $x$, and a zeroth-order (value) oracle only returns $f(x)$.} to $\{f_i\}_{i \in [n]}$. No high-accuracy samplers with nontrivial gradient query complexity were previously known. For densities with condition number $\kappa$, we give an algorithm obtaining mixing time $O(\kappa d \log^2\tfrac{\kappa d}{\epsilon})$, improving upon Lee et al. by a logarithmic factor with a significantly simpler analysis. We also show a zeroth-order algorithm attains the same query complexity.} }
Endnote
%0 Conference Paper %T Structured Logconcave Sampling with a Restricted Gaussian Oracle %A Yin Tat Lee %A Ruoqi Shen %A Kevin Tian %B Proceedings of Thirty Fourth Conference on Learning Theory %C Proceedings of Machine Learning Research %D 2021 %E Mikhail Belkin %E Samory Kpotufe %F pmlr-v134-lee21a %I PMLR %P 2993--3050 %U https://proceedings.mlr.press/v134/lee21a.html %V 134 %X We give algorithms for sampling several structured logconcave families to high accuracy. We further develop a reduction framework, inspired by proximal point methods in convex optimization, which bootstraps samplers for regularized densities to generically improve dependences on problem conditioning $\kappa$ from polynomial to linear. A key ingredient in our framework is the notion of a "restricted Gaussian oracle" (RGO) for $g: \mathbb{R}^d \rightarrow \mathbb{R}$, which is a sampler for distributions whose negative log-likelihood sums a quadratic (in a multiple of the identity) and $g$. By combining our reduction framework with our new samplers, we obtain the following bounds for sampling structured distributions to total variation distance $\epsilon$. For composite densities $\exp(-f(x) - g(x))$, where $f$ has condition number $\kappa$ and convex (but possibly non-smooth) $g$ admits an RGO, we obtain a mixing time of $O(\kappa d \log^3\tfrac{\kappa d}{\epsilon})$, matching the state-of-the-art non-composite bound of Lee et al. No composite samplers with better mixing than general-purpose logconcave samplers were previously known. For logconcave finite sums $\exp(-F(x))$, where $F(x) = \tfrac{1}{n}\sum_{i \in [n]} f_i(x)$ has condition number $\kappa$, we give a sampler querying $\widetilde{O}(n + \kappa\max(d, \sqrt{nd}))$ gradient oracles\footnote{For convenience of exposition, the $\widetilde{O}$ notation hides logarithmic factors in the dimension $d$, problem conditioning $\kappa$, desired accuracy $\epsilon$, and summand count $n$ (when applicable). A first-order (gradient) oracle for $f:\mathbb{R}^d \rightarrow \mathbb{R}$ returns $(f(x), \nabla f(x))$ on input $x$, and a zeroth-order (value) oracle only returns $f(x)$.} to $\{f_i\}_{i \in [n]}$. No high-accuracy samplers with nontrivial gradient query complexity were previously known. For densities with condition number $\kappa$, we give an algorithm obtaining mixing time $O(\kappa d \log^2\tfrac{\kappa d}{\epsilon})$, improving upon Lee et al. by a logarithmic factor with a significantly simpler analysis. We also show a zeroth-order algorithm attains the same query complexity.
APA
Lee, Y.T., Shen, R. & Tian, K. (2021). Structured Logconcave Sampling with a Restricted Gaussian Oracle. Proceedings of Thirty Fourth Conference on Learning Theory, in Proceedings of Machine Learning Research 134:2993-3050. Available from https://proceedings.mlr.press/v134/lee21a.html.
