Adaptive Compression in Federated Learning via Side Information

Berivan Isik, Francesco Pase, Deniz Gunduz, Sanmi Koyejo, Tsachy Weissman, Michele Zorzi
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:487-495, 2024.

Abstract

The high communication cost of sending model updates from the clients to the server is a significant bottleneck for scalable federated learning (FL). Among existing approaches, state-of-the-art bitrate-accuracy tradeoffs have been achieved with stochastic compression methods, in which client $n$ sends a sample from a client-only probability distribution $q_{\phi^{(n)}}$ and the server estimates the mean of the clients' distributions from these samples. However, such methods do not take full advantage of the FL setup, where the server, throughout training, has side information in the form of a global distribution $p_{\theta}$ that is close to each client-only distribution $q_{\phi^{(n)}}$ in Kullback-Leibler (KL) divergence. In this work, we exploit this \emph{closeness} between the clients' distributions $q_{\phi^{(n)}}$ and the side information $p_{\theta}$ at the server, and propose a framework that requires approximately $D_{KL}(q_{\phi^{(n)}} \| p_{\theta})$ bits of communication. We show that our method can be integrated into many existing stochastic compression frameworks to attain the same (and often higher) test accuracy with a bitrate up to 82 times smaller than prior work, corresponding to 2,650 times overall compression.
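
To illustrate the kind of mechanism the abstract refers to, the sketch below shows one standard way to communicate a sample from a client distribution q using shared side information p at roughly D_KL(q || p) bits: importance sampling over candidates drawn from p with a random seed shared between client and server (often called minimal random coding). This is a minimal illustrative sketch, not the authors' implementation; the 1-D Gaussian toy example, the function names, and the choice of K = 2^ceil(KL) candidates are assumptions made only to keep the example runnable.

    import numpy as np

    def encode(q_logpdf, p_logpdf, p_sampler, kl_bits, seed):
        # Client: draw K candidates from the shared side information p using the
        # shared seed, then pick one with probability proportional to q/p.
        rng = np.random.default_rng(seed)        # randomness shared with the server
        K = int(2 ** np.ceil(kl_bits))           # ~2^{D_KL(q||p)} candidates
        candidates = p_sampler(rng, K)
        log_w = q_logpdf(candidates) - p_logpdf(candidates)  # importance weights q/p
        probs = np.exp(log_w - log_w.max())
        probs /= probs.sum()
        index = int(rng.choice(K, p=probs))      # sending this index costs ~log2(K) bits
        return index

    def decode(p_sampler, kl_bits, seed, index):
        # Server: regenerate the identical candidate list from the shared seed
        # and return the candidate the client selected.
        rng = np.random.default_rng(seed)
        K = int(2 ** np.ceil(kl_bits))
        candidates = p_sampler(rng, K)
        return candidates[index]

    # Toy usage with hypothetical 1-D Gaussians: client q = N(2, 1), server side
    # information p = N(0, 1); normalizing constants cancel in the q/p ratio.
    q_logpdf = lambda x: -0.5 * (x - 2.0) ** 2
    p_logpdf = lambda x: -0.5 * x ** 2
    p_sampler = lambda rng, k: rng.standard_normal(k)
    kl_bits = (2.0 ** 2 / 2) / np.log(2)         # D_KL(N(2,1) || N(0,1)) in bits
    idx = encode(q_logpdf, p_logpdf, p_sampler, kl_bits, seed=0)
    print(idx, decode(p_sampler, kl_bits, seed=0, index=idx))

In the FL setting of the paper, q_{\phi^{(n)}} and p_{\theta} would be distributions over model updates rather than a scalar, and the shared seed plays the role of common randomness between client n and the server; the point of the sketch is only that the transmitted index, not the sample itself, carries approximately D_KL(q || p) bits.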

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-isik24a,
  title     = {Adaptive Compression in Federated Learning via Side Information},
  author    = {Isik, Berivan and Pase, Francesco and Gunduz, Deniz and Koyejo, Sanmi and Weissman, Tsachy and Zorzi, Michele},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {487--495},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/isik24a/isik24a.pdf},
  url       = {https://proceedings.mlr.press/v238/isik24a.html},
  abstract  = {The high communication cost of sending model updates from the clients to the server is a significant bottleneck for scalable federated learning (FL). Among existing approaches, state-of-the-art bitrate-accuracy tradeoffs have been achieved using stochastic compression methods – in which the client n sends a sample from a client-only probability distribution $q_{\phi^{(n)}}$, and the server estimates the mean of the clients’ distributions using these samples. However, such methods do not take full advantage of the FL setup where the server, throughout the training process, has side information in the form of a global distribution $p_{\theta}$ that is close to the client-only distribution $q_{\phi^{(n)}}$ in Kullback-Leibler (KL) divergence. In this work, we exploit this \emph{closeness} between the clients’ distributions $q_{\phi^{(n)}}$’s and the side information $p_{\theta}$ at the server, and propose a framework that requires approximately $D_{KL}(q_{\phi^{(n)}}|| p_{\theta})$ bits of communication. We show that our method can be integrated into many existing stochastic compression frameworks to attain the same (and often higher) test accuracy with up to 82 times smaller bitrate than the prior work – corresponding to 2,650 times overall compression.}
}
Endnote
%0 Conference Paper
%T Adaptive Compression in Federated Learning via Side Information
%A Berivan Isik
%A Francesco Pase
%A Deniz Gunduz
%A Sanmi Koyejo
%A Tsachy Weissman
%A Michele Zorzi
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-isik24a
%I PMLR
%P 487--495
%U https://proceedings.mlr.press/v238/isik24a.html
%V 238
%X The high communication cost of sending model updates from the clients to the server is a significant bottleneck for scalable federated learning (FL). Among existing approaches, state-of-the-art bitrate-accuracy tradeoffs have been achieved using stochastic compression methods – in which the client n sends a sample from a client-only probability distribution $q_{\phi^{(n)}}$, and the server estimates the mean of the clients’ distributions using these samples. However, such methods do not take full advantage of the FL setup where the server, throughout the training process, has side information in the form of a global distribution $p_{\theta}$ that is close to the client-only distribution $q_{\phi^{(n)}}$ in Kullback-Leibler (KL) divergence. In this work, we exploit this \emph{closeness} between the clients’ distributions $q_{\phi^{(n)}}$’s and the side information $p_{\theta}$ at the server, and propose a framework that requires approximately $D_{KL}(q_{\phi^{(n)}}|| p_{\theta})$ bits of communication. We show that our method can be integrated into many existing stochastic compression frameworks to attain the same (and often higher) test accuracy with up to 82 times smaller bitrate than the prior work – corresponding to 2,650 times overall compression.
APA
Isik, B., Pase, F., Gunduz, D., Koyejo, S., Weissman, T. & Zorzi, M. (2024). Adaptive Compression in Federated Learning via Side Information. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:487-495. Available from https://proceedings.mlr.press/v238/isik24a.html.