Heterogeneity for the Win: One-Shot Federated Clustering

Don Kurian Dennis, Tian Li, Virginia Smith
Proceedings of the 38th International Conference on Machine Learning, PMLR 139:2611-2620, 2021.

Abstract

In this work, we explore the unique challenges—and opportunities—of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, kfed, based on the widely-used Lloyd’s method for $k$-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. We analyse kfed under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device $(k’)$ is smaller than the total number of clusters over the network $k$, $(k’\le \sqrt{k})$, we can use heterogeneity to our advantage—significantly weakening the cluster separation requirements for kfed. From a practical viewpoint, kfed also has many desirable properties: it requires only round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.

Cite this Paper


BibTeX
@InProceedings{pmlr-v139-dennis21a, title = {Heterogeneity for the Win: One-Shot Federated Clustering}, author = {Dennis, Don Kurian and Li, Tian and Smith, Virginia}, booktitle = {Proceedings of the 38th International Conference on Machine Learning}, pages = {2611--2620}, year = {2021}, editor = {Meila, Marina and Zhang, Tong}, volume = {139}, series = {Proceedings of Machine Learning Research}, month = {18--24 Jul}, publisher = {PMLR}, pdf = {http://proceedings.mlr.press/v139/dennis21a/dennis21a.pdf}, url = {https://proceedings.mlr.press/v139/dennis21a.html}, abstract = {In this work, we explore the unique challenges—and opportunities—of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, kfed, based on the widely-used Lloyd’s method for $k$-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. We analyse kfed under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device $(k’)$ is smaller than the total number of clusters over the network $k$, $(k’\le \sqrt{k})$, we can use heterogeneity to our advantage—significantly weakening the cluster separation requirements for kfed. From a practical viewpoint, kfed also has many desirable properties: it requires only round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.} }
Endnote
%0 Conference Paper %T Heterogeneity for the Win: One-Shot Federated Clustering %A Don Kurian Dennis %A Tian Li %A Virginia Smith %B Proceedings of the 38th International Conference on Machine Learning %C Proceedings of Machine Learning Research %D 2021 %E Marina Meila %E Tong Zhang %F pmlr-v139-dennis21a %I PMLR %P 2611--2620 %U https://proceedings.mlr.press/v139/dennis21a.html %V 139 %X In this work, we explore the unique challenges—and opportunities—of unsupervised federated learning (FL). We develop and analyze a one-shot federated clustering scheme, kfed, based on the widely-used Lloyd’s method for $k$-means clustering. In contrast to many supervised problems, we show that the issue of statistical heterogeneity in federated networks can in fact benefit our analysis. We analyse kfed under a center separation assumption and compare it to the best known requirements of its centralized counterpart. Our analysis shows that in heterogeneous regimes where the number of clusters per device $(k’)$ is smaller than the total number of clusters over the network $k$, $(k’\le \sqrt{k})$, we can use heterogeneity to our advantage—significantly weakening the cluster separation requirements for kfed. From a practical viewpoint, kfed also has many desirable properties: it requires only round of communication, can run asynchronously, and can handle partial participation or node/network failures. We motivate our analysis with experiments on common FL benchmarks, and highlight the practical utility of one-shot clustering through use-cases in personalized FL and device sampling.
APA
Dennis, D.K., Li, T. & Smith, V.. (2021). Heterogeneity for the Win: One-Shot Federated Clustering. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:2611-2620 Available from https://proceedings.mlr.press/v139/dennis21a.html.

Related Material