Scalability Enhancement and Data-Heterogeneity Awareness in Gradient Tracking based Decentralized Bayesian Learning
Proceedings of the 7th Annual Learning for Dynamics & Control Conference, PMLR 283:591-605, 2025.
Abstract
This paper proposes a Gradient Tracking Decentralized Unadjusted Langevin Algorithm (GT-DULA) for performing Bayesian learning via MCMC sampling. GT-DULA improves on the scalability of conventional DULA: for a constant gradient step size, it reduces the dependence of the convergence bias on the network size by an order of magnitude. GT-DULA replaces each agent's local gradient with an estimate of the global gradient that is shared among neighbors in the network. Our theoretical analysis shows that GT-DULA tracks the global gradient to within a certain neighborhood, which yields a twofold benefit. First, the optimal mixing of the gradient estimates leads to a lower convergence bias. Second, successfully tracking the global gradient implies robustness to data heterogeneity, a major concern in decentralized learning.
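To make the mechanism concrete, the sketch below shows one possible synchronous round combining a standard gradient-tracking recursion with an unadjusted Langevin step. This is an illustrative assumption based on the abstract, not the paper's exact recursion: the mixing matrix `W` (assumed doubly stochastic), the callables `grad_f`, and the function name `gt_dula_step` are all hypothetical.

```python
# Hypothetical sketch of a gradient-tracking Langevin update; the paper's
# exact update rule may differ. Assumes a doubly stochastic mixing matrix W
# and per-agent gradients grad_f[i] of the local negative log-posterior.
import numpy as np

def gt_dula_step(x, g, prev_grads, W, grad_f, alpha, rng):
    """One synchronous round of a gradient-tracking unadjusted Langevin step.

    x          : (n, d) array, row i holds agent i's current sample
    g          : (n, d) array, row i holds agent i's global-gradient tracker
    prev_grads : (n, d) array, each agent's last local gradient
    W          : (n, n) doubly stochastic mixing matrix
    grad_f     : list of n callables, grad_f[i](x_i) -> (d,) local gradient
    alpha      : constant step size
    """
    n, d = x.shape
    noise = rng.standard_normal((n, d))
    # Langevin update driven by the tracked global-gradient estimate,
    # after mixing iterates with the neighbors' values.
    x_new = W @ x - alpha * g + np.sqrt(2.0 * alpha) * noise
    # Gradient-tracking recursion: mix neighbors' trackers and correct
    # with the change in the local gradient.
    new_grads = np.stack([grad_f[i](x_new[i]) for i in range(n)])
    g_new = W @ g + new_grads - prev_grads
    return x_new, g_new, new_grads

# Toy usage: 4 agents sampling a 2-D Gaussian posterior split across agents,
# on a ring network (all quantities here are illustrative).
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, d = 4, 2
    W = 0.5 * np.eye(n) + 0.25 * (np.roll(np.eye(n), 1, 0) + np.roll(np.eye(n), -1, 0))
    grad_f = [lambda xi, c=c: xi - c for c in rng.standard_normal((n, d))]
    x = rng.standard_normal((n, d))
    prev_grads = np.stack([grad_f[i](x[i]) for i in range(n)])
    g = prev_grads.copy()  # trackers initialized to local gradients
    for _ in range(1000):
        x, g, prev_grads = gt_dula_step(x, g, prev_grads, W, grad_f, 0.05, rng)
```

With a doubly stochastic `W`, averaging the tracker rows recovers the mean of the current local gradients, which is the property that lets each agent's update be driven by (an estimate of) the global gradient rather than its own heterogeneous local one.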