Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts

Amir Najafi, Samin Mahdizadeh Sani, Farzan Farnia
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:45588-45623, 2025.

Abstract

We address the challenge of certifying the performance of a federated learning model on an unseen target network using only measurements from the source network that trained the model. Specifically, consider a source network "A" with $K$ clients, each holding private, non-IID datasets drawn from heterogeneous distributions, modeled as samples from a broader meta-distribution $\mu$. Our goal is to provide certified guarantees for the model’s performance on a different, unseen network "B", governed by an unknown meta-distribution $\mu'$, assuming the deviation between $\mu$ and $\mu'$ is bounded, either in Wasserstein distance or in an $f$-divergence. We derive worst-case uniform guarantees for both the model’s average loss and its risk CDF, the latter corresponding to a novel, adversarially robust version of the Dvoretzky–Kiefer–Wolfowitz (DKW) inequality. In addition, we show how the vanilla DKW bound enables principled certification of the model’s true performance on unseen clients within the same (source) network. Our bounds are efficiently computable, asymptotically minimax optimal, and preserve clients’ privacy. We also establish non-asymptotic generalization bounds that converge to zero as $K$ grows and the minimum per-client sample size exceeds $\mathcal{O}(\log K)$. Empirical evaluations confirm the practical utility of our bounds across real-world tasks. The project code is available at: github.com/samin-mehdizadeh/Robust-Evaluation-DKW
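
As a rough illustration of the vanilla DKW certification mentioned in the abstract, the Python sketch below computes the classical DKW band half-width, $\epsilon = \sqrt{\ln(2/\delta) / (2K)}$, and uses it to lower-bound the fraction of unseen same-network clients whose risk stays below a threshold. It is not taken from the paper or its repository: the function names are hypothetical, and the sketch assumes that the $K$ client risks are i.i.d. draws induced by $\mu$ and that each client's risk is measured exactly. It does not implement the paper's shift-robust bounds.

```python
import math
from typing import Sequence


def dkw_epsilon(num_clients: int, delta: float) -> float:
    """Half-width of the classical DKW band: sqrt(ln(2/delta) / (2K))."""
    if num_clients <= 0:
        raise ValueError("num_clients must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(math.log(2.0 / delta) / (2.0 * num_clients))


def empirical_cdf(client_risks: Sequence[float], threshold: float) -> float:
    """Fraction of observed clients whose risk is at most `threshold`."""
    if len(client_risks) == 0:
        raise ValueError("client_risks must be non-empty")
    return sum(1 for r in client_risks if r <= threshold) / len(client_risks)


def certified_cdf_lower_bound(
    client_risks: Sequence[float], threshold: float, delta: float
) -> float:
    """Lower bound on P(risk <= threshold) for an unseen client.

    Holds with probability at least 1 - delta, simultaneously for all
    thresholds, under the i.i.d. and exact-risk assumptions stated above.
    """
    eps = dkw_epsilon(len(client_risks), delta)
    return max(0.0, empirical_cdf(client_risks, threshold) - eps)


if __name__ == "__main__":
    # Hypothetical per-client risks, for illustration only.
    risks = [0.12, 0.08, 0.15, 0.30, 0.22, 0.10, 0.18, 0.09, 0.25, 0.14]
    print(dkw_epsilon(len(risks), delta=0.05))
    print(certified_cdf_lower_bound(risks, threshold=0.2, delta=0.05))
```

Because the half-width shrinks only as $1/\sqrt{K}$, a band built from a handful of clients is wide; the guarantee becomes informative once $K$ is reasonably large.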

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-najafi25a,
  title = {Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts},
  author = {Najafi, Amir and Sani, Samin Mahdizadeh and Farnia, Farzan},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {45588--45623},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/najafi25a/najafi25a.pdf},
  url = {https://proceedings.mlr.press/v267/najafi25a.html},
  abstract = {We address the challenge of certifying the performance of a federated learning model on an unseen target network using only measurements from the source network that trained the model. Specifically, consider a source network "A" with $K$ clients, each holding private, non-IID datasets drawn from heterogeneous distributions, modeled as samples from a broader meta-distribution $\mu$. Our goal is to provide certified guarantees for the model’s performance on a different, unseen network "B", governed by an unknown meta-distribution $\mu'$, assuming the deviation between $\mu$ and $\mu'$ is bounded, either in Wasserstein distance or in an $f$-divergence. We derive worst-case uniform guarantees for both the model’s average loss and its risk CDF, the latter corresponding to a novel, adversarially robust version of the Dvoretzky–Kiefer–Wolfowitz (DKW) inequality. In addition, we show how the vanilla DKW bound enables principled certification of the model’s true performance on unseen clients within the same (source) network. Our bounds are efficiently computable, asymptotically minimax optimal, and preserve clients’ privacy. We also establish non-asymptotic generalization bounds that converge to zero as $K$ grows and the minimum per-client sample size exceeds $\mathcal{O}(\log K)$. Empirical evaluations confirm the practical utility of our bounds across real-world tasks. The project code is available at: github.com/samin-mehdizadeh/Robust-Evaluation-DKW}
}
Endnote
%0 Conference Paper
%T Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts
%A Amir Najafi
%A Samin Mahdizadeh Sani
%A Farzan Farnia
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-najafi25a
%I PMLR
%P 45588--45623
%U https://proceedings.mlr.press/v267/najafi25a.html
%V 267
%X We address the challenge of certifying the performance of a federated learning model on an unseen target network using only measurements from the source network that trained the model. Specifically, consider a source network "A" with $K$ clients, each holding private, non-IID datasets drawn from heterogeneous distributions, modeled as samples from a broader meta-distribution $\mu$. Our goal is to provide certified guarantees for the model’s performance on a different, unseen network "B", governed by an unknown meta-distribution $\mu'$, assuming the deviation between $\mu$ and $\mu'$ is bounded, either in Wasserstein distance or in an $f$-divergence. We derive worst-case uniform guarantees for both the model’s average loss and its risk CDF, the latter corresponding to a novel, adversarially robust version of the Dvoretzky–Kiefer–Wolfowitz (DKW) inequality. In addition, we show how the vanilla DKW bound enables principled certification of the model’s true performance on unseen clients within the same (source) network. Our bounds are efficiently computable, asymptotically minimax optimal, and preserve clients’ privacy. We also establish non-asymptotic generalization bounds that converge to zero as $K$ grows and the minimum per-client sample size exceeds $\mathcal{O}(\log K)$. Empirical evaluations confirm the practical utility of our bounds across real-world tasks. The project code is available at: github.com/samin-mehdizadeh/Robust-Evaluation-DKW
APA
Najafi, A., Sani, S. M., & Farnia, F. (2025). Certifiably Robust Model Evaluation in Federated Learning under Meta-Distributional Shifts. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:45588-45623. Available from https://proceedings.mlr.press/v267/najafi25a.html.
