Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning

Yann Fraboni, Richard Vidal, Laetitia Kameni, Marco Lorenzi
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:3407-3416, 2021.

Abstract

This work addresses the problem of optimizing communications between the server and the clients in federated learning (FL). Current sampling approaches in FL are either biased or suboptimal in terms of server-client communication and training stability. To overcome this issue, we introduce clustered sampling for client selection. We prove that clustered sampling leads to better client representativity and to a reduced variance of the clients' stochastic aggregation weights in FL. Consistently with our theory, we provide two clustering approaches that group clients based on 1) sample size and 2) model similarity. Through a series of experiments in non-iid and unbalanced scenarios, we demonstrate that model aggregation through clustered sampling consistently leads to better training convergence and lower variability than standard sampling approaches. Our approach requires no additional operation on the client side and can be seamlessly integrated into standard FL implementations. Finally, clustered sampling is compatible with existing methods and technologies for privacy enhancement and for communication reduction through model compression.
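As a rough illustration of the first variant mentioned above (clustering based on sample size), the sketch below spreads the clients' usual sampling probabilities p_i = n_i / M (n_i being client i's sample count and M the total) over m clusters of equal probability mass 1/m, then draws one client per cluster. This is only our reading of the idea summarized in the abstract; the paper specifies the exact algorithms, and all function names and the toy sample sizes here are illustrative, not taken from the paper or its code.

    import numpy as np

    def build_clusters(sample_sizes, m):
        # Spread the clients' probability mass p_i = n_i / M over m clusters,
        # each holding a total mass of 1/m; a client may be split across clusters.
        # Returns an (m, n_clients) matrix r whose row k is the distribution used
        # when sampling from cluster k (each row sums to 1).
        p = np.asarray(sample_sizes, dtype=float)
        p /= p.sum()
        r = np.zeros((m, len(p)))
        room = 1.0 / m          # mass still free in the current cluster
        k = 0
        for i in range(len(p)):
            left = p[i]         # mass of client i not yet placed
            while left > 1e-12 and k < m:
                take = min(left, room)
                r[k, i] += take
                left -= take
                room -= take
                if room <= 1e-12:       # current cluster is full
                    k += 1
                    room = 1.0 / m
        return r / r.sum(axis=1, keepdims=True)

    def sample_clients(r, rng):
        # Draw one client per cluster, independently across the m clusters.
        return [int(rng.choice(r.shape[1], p=row)) for row in r]

    rng = np.random.default_rng(0)
    r = build_clusters([100, 50, 50, 200, 10, 90], m=3)
    print(sample_clients(r, rng))   # indices of the 3 selected clients, one per cluster

Because every cluster carries the same total mass and clients fill clusters in proportion to p_i, the expected aggregation weight of each client matches standard sampling, while each round is guaranteed to spread its selections across the clusters.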

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-fraboni21a,
  title     = {Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning},
  author    = {Fraboni, Yann and Vidal, Richard and Kameni, Laetitia and Lorenzi, Marco},
  booktitle = {Proceedings of the 38th International Conference on Machine Learning},
  pages     = {3407--3416},
  year      = {2021},
  editor    = {Meila, Marina and Zhang, Tong},
  volume    = {139},
  series    = {Proceedings of Machine Learning Research},
  month     = {18--24 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v139/fraboni21a/fraboni21a.pdf},
  url       = {https://proceedings.mlr.press/v139/fraboni21a.html},
  abstract  = {This work addresses the problem of optimizing communications between server and clients in federated learning (FL). Current sampling approaches in FL are either biased, or non optimal in terms of server-clients communications and training stability. To overcome this issue, we introduce clustered sampling for clients selection. We prove that clustered sampling leads to better clients representativity and to reduced variance of the clients stochastic aggregation weights in FL. Compatibly with our theory, we provide two different clustering approaches enabling clients aggregation based on 1) sample size, and 2) models similarity. Through a series of experiments in non-iid and unbalanced scenarios, we demonstrate that model aggregation through clustered sampling consistently leads to better training convergence and variability when compared to standard sampling approaches. Our approach does not require any additional operation on the clients side, and can be seamlessly integrated in standard FL implementations. Finally, clustered sampling is compatible with existing methods and technologies for privacy enhancement, and for communication reduction through model compression.}
}
Endnote
%0 Conference Paper
%T Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning
%A Yann Fraboni
%A Richard Vidal
%A Laetitia Kameni
%A Marco Lorenzi
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-fraboni21a
%I PMLR
%P 3407--3416
%U https://proceedings.mlr.press/v139/fraboni21a.html
%V 139
%X This work addresses the problem of optimizing communications between server and clients in federated learning (FL). Current sampling approaches in FL are either biased, or non optimal in terms of server-clients communications and training stability. To overcome this issue, we introduce clustered sampling for clients selection. We prove that clustered sampling leads to better clients representativity and to reduced variance of the clients stochastic aggregation weights in FL. Compatibly with our theory, we provide two different clustering approaches enabling clients aggregation based on 1) sample size, and 2) models similarity. Through a series of experiments in non-iid and unbalanced scenarios, we demonstrate that model aggregation through clustered sampling consistently leads to better training convergence and variability when compared to standard sampling approaches. Our approach does not require any additional operation on the clients side, and can be seamlessly integrated in standard FL implementations. Finally, clustered sampling is compatible with existing methods and technologies for privacy enhancement, and for communication reduction through model compression.
APA
Fraboni, Y., Vidal, R., Kameni, L. & Lorenzi, M. (2021). Clustered Sampling: Low-Variance and Improved Representativity for Clients Selection in Federated Learning. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:3407-3416. Available from https://proceedings.mlr.press/v139/fraboni21a.html.
