Scalability Enhancement and Data-Heterogeneity Awareness in Gradient Tracking based Decentralized Bayesian Learning

Kinjal Bhar, He Bai, Jemin George, Carl Busart
Proceedings of the 7th Annual Learning for Dynamics & Control Conference, PMLR 283:591-605, 2025.

Abstract

This paper proposes a Gradient Tracking Decentralized Unadjusted Langevin Algorithm (GT-DULA) for Bayesian learning via MCMC sampling. GT-DULA improves on the scalability of the conventional DULA by reducing the dependence of the convergence bias on the network size by an order of magnitude for a constant gradient step size. In place of local gradients, GT-DULA uses an estimate of the global gradient that is shared among neighbors in the network. Our theoretical analysis shows that GT-DULA tracks the global gradient to within a certain neighborhood, which yields a twofold benefit. First, the optimal mixing of the gradient estimates lowers the convergence bias. Second, successful tracking of the global gradient implies robustness to data heterogeneity, a major concern in decentralized learning.
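To make the idea concrete, below is a minimal Python/NumPy sketch of one round of a gradient-tracking unadjusted Langevin update. This is an illustrative assumption, not the paper's implementation: the exact recursion, step-size schedule, and symbols in the paper may differ, and the names gt_dula_step, W, and grads are hypothetical. Each agent mixes neighbors' samples, takes a Langevin step along its tracked estimate of the global gradient, injects Gaussian noise, and then updates its tracker with the change in its own local gradient.

import numpy as np

def gt_dula_step(X, G, W, grads, alpha, rng):
    # One synchronous round of gradient-tracking Langevin sampling for n
    # agents (hedged sketch; not the authors' code).
    #   X     : (n, d) array, agent i's current sample in row i
    #   G     : (n, d) array, gradient-tracking variables (global-gradient estimates)
    #   W     : (n, n) doubly stochastic mixing matrix of the communication network
    #   grads : list of n callables, grads[i](x) = gradient of local potential f_i
    #   alpha : constant gradient step size
    n, d = X.shape
    noise = rng.standard_normal((n, d))
    # Langevin step: mix neighbors' samples, descend along the tracked
    # global-gradient estimate, and inject sqrt(2*alpha)-scaled noise.
    X_new = W @ X - alpha * G + np.sqrt(2.0 * alpha) * noise
    # Gradient-tracking update: mix neighbors' trackers and add the change
    # in the local gradient. With W doubly stochastic, the average of the
    # rows of G stays equal to the average of the local gradients.
    old = np.stack([grads[i](X[i]) for i in range(n)])
    new = np.stack([grads[i](X_new[i]) for i in range(n)])
    G_new = W @ G + new - old
    return X_new, G_new

# Usage: initialize each tracker with the corresponding local gradient so
# the tracking invariant holds from the first round, e.g.
#   X = rng.standard_normal((n, d))
#   G = np.stack([grads[i](X[i]) for i in range(n)])
#   for _ in range(num_rounds):
#       X, G = gt_dula_step(X, G, W, grads, alpha, rng)

Because every agent steps along a tracked estimate of the average gradient rather than its own local gradient, the update is insensitive to how unevenly data are distributed across agents, which is the data-heterogeneity robustness the abstract refers to.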

Cite this Paper

BibTeX
@InProceedings{pmlr-v283-bhar25a,
  title     = {Scalability Enhancement and Data-Heterogeneity Awareness in Gradient Tracking based Decentralized Bayesian Learning},
  author    = {Bhar, Kinjal and Bai, He and George, Jemin and Busart, Carl},
  booktitle = {Proceedings of the 7th Annual Learning for Dynamics \& Control Conference},
  pages     = {591--605},
  year      = {2025},
  editor    = {Ozay, Necmiye and Balzano, Laura and Panagou, Dimitra and Abate, Alessandro},
  volume    = {283},
  series    = {Proceedings of Machine Learning Research},
  month     = {04--06 Jun},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v283/main/assets/bhar25a/bhar25a.pdf},
  url       = {https://proceedings.mlr.press/v283/bhar25a.html},
  abstract  = {This paper proposes a Gradient Tracking Decentralized Unadjusted Langevin Algorithm (GT-DULA) for Bayesian learning via MCMC sampling. GT-DULA improves on the scalability of the conventional DULA by reducing the dependence of the convergence bias on the network size by an order of magnitude for a constant gradient step size. In place of local gradients, GT-DULA uses an estimate of the global gradient that is shared among neighbors in the network. Our theoretical analysis shows that GT-DULA tracks the global gradient to within a certain neighborhood, which yields a twofold benefit. First, the optimal mixing of the gradient estimates lowers the convergence bias. Second, successful tracking of the global gradient implies robustness to data heterogeneity, a major concern in decentralized learning.}
}
Endnote
%0 Conference Paper
%T Scalability Enhancement and Data-Heterogeneity Awareness in Gradient Tracking based Decentralized Bayesian Learning
%A Kinjal Bhar
%A He Bai
%A Jemin George
%A Carl Busart
%B Proceedings of the 7th Annual Learning for Dynamics & Control Conference
%C Proceedings of Machine Learning Research
%D 2025
%E Necmiye Ozay
%E Laura Balzano
%E Dimitra Panagou
%E Alessandro Abate
%F pmlr-v283-bhar25a
%I PMLR
%P 591-605
%U https://proceedings.mlr.press/v283/bhar25a.html
%V 283
%X This paper proposes a Gradient Tracking Decentralized Unadjusted Langevin Algorithm (GT-DULA) for Bayesian learning via MCMC sampling. GT-DULA improves on the scalability of the conventional DULA by reducing the dependence of the convergence bias on the network size by an order of magnitude for a constant gradient step size. In place of local gradients, GT-DULA uses an estimate of the global gradient that is shared among neighbors in the network. Our theoretical analysis shows that GT-DULA tracks the global gradient to within a certain neighborhood, which yields a twofold benefit. First, the optimal mixing of the gradient estimates lowers the convergence bias. Second, successful tracking of the global gradient implies robustness to data heterogeneity, a major concern in decentralized learning.
APA
Bhar, K., Bai, H., George, J. & Busart, C. (2025). Scalability Enhancement and Data-Heterogeneity Awareness in Gradient Tracking based Decentralized Bayesian Learning. Proceedings of the 7th Annual Learning for Dynamics & Control Conference, in Proceedings of Machine Learning Research 283:591-605. Available from https://proceedings.mlr.press/v283/bhar25a.html.
