Characterizing Dependence of Samples along the Langevin Dynamics and Algorithms via Contraction of $\Phi$-Mutual Information (Extended Abstract)
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:3730-3731, 2025.
Abstract
The mixing time of a Markov chain determines how fast the iterates of the Markov chain converge to the stationary distribution; however, it does not control the dependencies between samples along the Markov chain. In this paper, we study the question of how fast the samples become approximately independent along popular Markov chains for continuous-space sampling: the Langevin dynamics in continuous time, and the Unadjusted Langevin Algorithm and the Proximal Sampler in discrete time. We measure the dependence between samples via $\Phi$-mutual information, which is a broad generalization of the standard mutual information, and which is equal to $0$ if and only if the samples are independent. We show that along these Markov chains, the $\Phi$-mutual information between the first and the $k$-th iterate decreases to $0$ exponentially fast in $k$ when the target distribution is strongly log-concave. Our proof technique is based on showing that Strong Data Processing Inequalities (SDPIs) hold along the Markov chains. To prove fast mixing of the Markov chains, we only need to show that the SDPIs hold for the stationary distribution. In contrast, to prove the contraction of $\Phi$-mutual information, we need to show that the SDPIs hold along the entire trajectories of the Markov chains; we prove this when the iterates along the Markov chains satisfy the corresponding $\Phi$-Sobolev inequality, which is implied by the strong log-concavity of the target distribution.
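As a rough illustration of the exponential decay described above, the following sketch works out a one-dimensional Gaussian toy case in closed form. It is not taken from the paper. It assumes a standard normal target $N(0,1)$ (so the potential is $f(x) = x^2/2$), a Gaussian initial iterate $X_0 \sim N(0, \texttt{var0})$, and a step size $h \in (0,1)$, and it uses only the standard mutual information, which the abstract names as a special case of $\Phi$-mutual information. The function name and parameters are hypothetical.

```python
import math


def ula_gaussian_mutual_information(k, h, var0=1.0):
    """Mutual information I(X_0; X_k) in nats for a toy Gaussian case.

    Assumptions (illustrative, not from the paper): the target is N(0, 1),
    so the Unadjusted Langevin Algorithm update is
        X_{j+1} = (1 - h) * X_j + sqrt(2 * h) * Z_j,  Z_j ~ N(0, 1) i.i.d.,
    and X_0 ~ N(0, var0) is independent of the noise.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    if not 0.0 < h < 1.0:
        raise ValueError("h must lie in (0, 1) for this sketch")

    a = 1.0 - h
    # Variance of X_k: contracted initial variance plus accumulated noise.
    var_k = a ** (2 * k) * var0 + 2.0 * h * sum(a ** (2 * j) for j in range(k))
    # Covariance between X_0 and X_k: only the contracted initial part remains.
    cov = a ** k * var0
    # Squared correlation coefficient of the jointly Gaussian pair (X_0, X_k).
    rho2 = cov ** 2 / (var0 * var_k)
    # Mutual information of a bivariate Gaussian with correlation rho.
    return -0.5 * math.log(1.0 - rho2)


if __name__ == "__main__":
    step = 0.1
    for k in (1, 5, 10, 20, 40):
        print(k, ula_gaussian_mutual_information(k, step))
```

In this toy case the mutual information shrinks by a factor approaching $(1-h)^2$ per step for large $k$, which is consistent with the exponential decay in $k$ stated in the abstract. The paper's results cover general strongly log-concave targets and general $\Phi$, which this sketch does not attempt to reproduce.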