[edit]
Contraction of Markovian Operators in Orlicz Spaces and Error Bounds for Markov Chain Monte Carlo (Extended Abstract)
Proceedings of Thirty Seventh Conference on Learning Theory, PMLR 247:1643-1645, 2024.
Abstract
We introduce a novel concept of convergence for Markovian processes within Orlicz spaces, extending beyond the conventional approach associated with $L_p$ spaces. After showing that Markovian operators are contractive in Orlicz spaces, our technical contribution is an upper bound on their contraction coefficient, which admits a closed-form expression. The bound is tight in some settings, and it recovers well-known results, such as the connection between contraction and ergodicity, ultra-mixing and Doeblin’s minorisation. Moreover, we can define a notion of convergence of Markov processes in Orlicz spaces, which depends on the corresponding contraction coefficient. The key novelty comes from duality considerations: the convergence of a Markovian process determined by $K$ depends on the contraction coefficient of its dual $K^\star$, which can in turn be bounded by considering appropriate nested norms of densities of $K^\star$ with respect to the stationary measure. Our approach stands out as the first of its kind, as it does not rely on the existence of a spectral gap. Specialising our approach to $L_p$ spaces leads to a significant improvement upon classical Riesz-Thorin’s interpolation methods. We present the following applications of the proposed framework: \begin{enumerate} \item Tighter bounds on the mixing time of Markovian processes: one can relate the contraction coefficient of the dual operator to the mixing time of the corresponding Markov chain regardless of the norm chosen. Consequently, our tighter bound on the contraction coefficient implies a tighter bound on the mixing time. We offer a result that provides an intuitive understanding of what it means to be close in a specific norm (relating the probability of any event with the probability of the same event under the stationary measure $\pi$ and a $\psi$-Orlicz/Amemiya-norm). We then focus on $L_p$ norms and show that asking for a bounded norm with larger $p$ guarantees a faster decay in the probability. This is particularly relevant for exponentially decaying probabilities under $\pi$. Moreover, by exploiting the flexibility offered by Orlicz spaces, we can tackle settings where the stationary distribution is heavy-tailed, a severely under-studied setup. \item Improved concentration bounds for MCMC methods leading to improved lower bounds on the burn-in period: by leveraging $L_p$-norms with large $p$ and our results on the contraction coefficient, similar to the approach undertaken for the mixing times, we can provide improved exponential concentration bounds for MCMC methods. \item Improved concentration bounds for sequences of Markovian random variables: we show how our results can be used to outperform existing bounds based on a change of measure technique for random variables with a Markovian dependence. In particular, we can prove exponential concentration in new settings (inaccessible to earlier approaches) and improve the rate in others. \end{enumerate}