Federated Learning for Data Streams

Othmane Marfoq, Giovanni Neglia, Laetitia Kameni, Richard Vidal
Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, PMLR 206:8889-8924, 2023.

Abstract

Federated learning (FL) is an effective solution to train machine learning models on the increasing amount of data generated by IoT devices and smartphones while keeping such data localized. Most previous work on federated learning assumes that clients operate on static datasets collected before training starts. This approach may be inefficient because 1) it ignores new samples clients collect during training, and 2) it may require a potentially long preparatory phase for clients to collect enough data. Moreover, learning on static datasets may be simply impossible in scenarios with small aggregate storage across devices. It is, therefore, necessary to design federated algorithms able to learn from data streams. In this work, we formulate and study the problem of federated learning for data streams. We propose a general FL algorithm to learn from data streams through an opportune weighted empirical risk minimization. Our theoretical analysis provides insights to configure such an algorithm, and we evaluate its performance on a wide range of machine learning tasks.

Cite this Paper


BibTeX
@InProceedings{pmlr-v206-marfoq23a,
  title = {Federated Learning for Data Streams},
  author = {Marfoq, Othmane and Neglia, Giovanni and Kameni, Laetitia and Vidal, Richard},
  booktitle = {Proceedings of The 26th International Conference on Artificial Intelligence and Statistics},
  pages = {8889--8924},
  year = {2023},
  editor = {Ruiz, Francisco and Dy, Jennifer and van de Meent, Jan-Willem},
  volume = {206},
  series = {Proceedings of Machine Learning Research},
  month = {25--27 Apr},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v206/marfoq23a/marfoq23a.pdf},
  url = {https://proceedings.mlr.press/v206/marfoq23a.html},
  abstract = {Federated learning (FL) is an effective solution to train machine learning models on the increasing amount of data generated by IoT devices and smartphones while keeping such data localized. Most previous work on federated learning assumes that clients operate on static datasets collected before training starts. This approach may be inefficient because 1) it ignores new samples clients collect during training, and 2) it may require a potentially long preparatory phase for clients to collect enough data. Moreover, learning on static datasets may be simply impossible in scenarios with small aggregate storage across devices. It is, therefore, necessary to design federated algorithms able to learn from data streams. In this work, we formulate and study the problem of federated learning for data streams. We propose a general FL algorithm to learn from data streams through an opportune weighted empirical risk minimization. Our theoretical analysis provides insights to configure such an algorithm, and we evaluate its performance on a wide range of machine learning tasks.}
}
Endnote
%0 Conference Paper
%T Federated Learning for Data Streams
%A Othmane Marfoq
%A Giovanni Neglia
%A Laetitia Kameni
%A Richard Vidal
%B Proceedings of The 26th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2023
%E Francisco Ruiz
%E Jennifer Dy
%E Jan-Willem van de Meent
%F pmlr-v206-marfoq23a
%I PMLR
%P 8889--8924
%U https://proceedings.mlr.press/v206/marfoq23a.html
%V 206
%X Federated learning (FL) is an effective solution to train machine learning models on the increasing amount of data generated by IoT devices and smartphones while keeping such data localized. Most previous work on federated learning assumes that clients operate on static datasets collected before training starts. This approach may be inefficient because 1) it ignores new samples clients collect during training, and 2) it may require a potentially long preparatory phase for clients to collect enough data. Moreover, learning on static datasets may be simply impossible in scenarios with small aggregate storage across devices. It is, therefore, necessary to design federated algorithms able to learn from data streams. In this work, we formulate and study the problem of federated learning for data streams. We propose a general FL algorithm to learn from data streams through an opportune weighted empirical risk minimization. Our theoretical analysis provides insights to configure such an algorithm, and we evaluate its performance on a wide range of machine learning tasks.
APA
Marfoq, O., Neglia, G., Kameni, L. & Vidal, R. (2023). Federated Learning for Data Streams. Proceedings of The 26th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 206:8889-8924. Available from https://proceedings.mlr.press/v206/marfoq23a.html.
