Heterogeneity for the Win: One-Shot Federated Clustering

Don Kurian Dennis, Tian Li, Virginia Smith
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2611-2620, 2021.

Abstract

In this work, we explore the unique challenges—and opportunities—of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, kfed, based on the widely-used Lloyd’s method for k-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. We analyse kfed under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device (k) is smaller than the total number of clusters over the network k, (kk), we can use heterogeneity to our advantage—significantly weakening the cluster separation requirements for kfed. From a practical viewpoint, kfed also has many desirable properties: it requires only round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-dennis21a, title = {Heterogeneity for the Win: One-Shot Federated Clustering}, author = {Dennis, Don Kurian and Li, Tian and Smith, Virginia}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {2611--2620}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/dennis21a/dennis21a.pdf}, url = {https://proceedings.mlr.press/v139/dennis21a.html}, abstract = {In this work, we explore the unique challenges—and opportunities—of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, kfed, based on the widely-used Lloyd’s method for $k$-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. We analyse kfed under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device $(k’)$ is smaller than the total number of clusters over the network $k$, $(k’\le \sqrt{k})$, we can use heterogeneity to our advantage—significantly weakening the cluster separation requirements for kfed. From a practical viewpoint, kfed also has many desirable properties: it requires only round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.} }
Endnote
%0 Conference Paper %T Heterogeneity for the Win: One-Shot Federated Clustering %A Don Kurian Dennis %A Tian Li %A Virginia Smith %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-dennis21a %I PMLR %P 2611--2620 %U https://proceedings.mlr.press/v139/dennis21a.html %V 139 %X In this work, we explore the unique challenges—and opportunities—of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, kfed, based on the widely-used Lloyd’s method for $k$-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. We analyse kfed under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device $(k’)$ is smaller than the total number of clusters over the network $k$, $(k’\le \sqrt{k})$, we can use heterogeneity to our advantage—significantly weakening the cluster separation requirements for kfed. From a practical viewpoint, kfed also has many desirable properties: it requires only round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.
APA
Dennis, D.K., Li, T. & Smith, V.. (2021). Heterogeneity for the Win: One-Shot Federated Clustering. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2611-2620 Available from https://proceedings.mlr.press/v139/dennis21a.html.

Related Material