Minimax Excess Risk of First-Order Methods for Statistical Learning with Data-Dependent Oracles

Kevin Scaman, Mathieu Even, Batiste Le Bars, Laurent Massoulie
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:3709-3717, 2024.

Abstract

In this paper, our aim is to analyse the generalization capabilities of first-order methods for statistical learning across several distinct yet related scenarios: supervised learning, transfer learning, robust learning and federated learning. To do so, we provide sharp upper and lower bounds on the minimax excess risk of strongly convex and smooth statistical learning when the gradient is accessed through partial observations given by a data-dependent oracle. This novel class of oracles can query the gradient with any given data distribution, and is thus well suited to scenarios in which the training data distribution does not match the target (or test) distribution. In particular, our upper and lower bounds are proportional to the smallest mean square error achievable by gradient estimators, allowing us to derive multiple sharp bounds in the aforementioned scenarios by drawing on the extensive literature on parameter estimation.
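To make the quantity being bounded concrete, the LaTeX sketch below spells out a schematic form of the minimax excess risk and of the abstract's proportionality claim. The notation here (the excess risk epsilon-star, the population risk R_D, the algorithm class A, the oracle O, the gradient estimator g, the estimation error sigma^2, and the constant c depending on strong convexity mu and smoothness L) is assumed shorthand for illustration and need not match the paper's own definitions.

% Schematic minimax excess risk (assumed notation, not the paper's):
% the best achievable worst-case gap between the risk of the output of a
% first-order algorithm A, using a data-dependent gradient oracle O, and
% the minimum of the target population risk R_D.
\[
  \varepsilon^{\star}
  \;=\;
  \inf_{\mathcal{A}} \,\sup_{(R_{\mathcal{D}},\, \mathcal{O})}
  \;\mathbb{E}\!\left[\, R_{\mathcal{D}}\big(\theta_{\mathcal{A}}\big)
    \;-\; \min_{\theta} R_{\mathcal{D}}(\theta) \,\right].
\]
% Informal shape of the result stated in the abstract: for mu-strongly
% convex, L-smooth risks, the upper and lower bounds are proportional to
% sigma^2, the smallest mean square error attainable by any estimator g
% of the true gradient built from the oracle's partial observations.
\[
  \varepsilon^{\star} \;\asymp\; c(\mu, L)\,\sigma^{2},
  \qquad
  \sigma^{2} \;=\; \inf_{g} \,\sup_{\theta}\;
  \mathbb{E}\,\big\| g(\theta) - \nabla R_{\mathcal{D}}(\theta) \big\|^{2}.
\]

Read this way, the reduction is useful because sigma^2 is exactly the kind of quantity the parameter-estimation literature already controls; plugging a known estimation rate into the display above yields an excess-risk bound for each of the scenarios listed in the abstract.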

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-scaman24a,
  title     = {Minimax Excess Risk of First-Order Methods for Statistical Learning with Data-Dependent Oracles},
  author    = {Scaman, Kevin and Even, Mathieu and Le Bars, Batiste and Massoulie, Laurent},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages     = {3709--3717},
  year      = {2024},
  editor    = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume    = {238},
  series    = {Proceedings of Machine Learning Research},
  month     = {02--04 May},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v238/scaman24a/scaman24a.pdf},
  url       = {https://proceedings.mlr.press/v238/scaman24a.html},
  abstract  = {In this paper, our aim is to analyse the generalization capabilities of first-order methods for statistical learning in multiple, different yet related, scenarios including supervised learning, transfer learning, robust learning and federated learning. To do so, we provide sharp upper and lower bounds for the minimax excess risk of strongly convex and smooth statistical learning when the gradient is accessed through partial observations given by a data-dependent oracle. This novel class of oracles can query the gradient with any given data distribution, and is thus well suited to scenarios in which the training data distribution does not match the target (or test) distribution. In particular, our upper and lower bounds are proportional to the smallest mean square error achievable by gradient estimators, thus allowing us to easily derive multiple sharp bounds in the aforementioned scenarios using the extensive literature on parameter estimation.}
}
Endnote
%0 Conference Paper
%T Minimax Excess Risk of First-Order Methods for Statistical Learning with Data-Dependent Oracles
%A Kevin Scaman
%A Mathieu Even
%A Batiste Le Bars
%A Laurent Massoulie
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-scaman24a
%I PMLR
%P 3709--3717
%U https://proceedings.mlr.press/v238/scaman24a.html
%V 238
%X In this paper, our aim is to analyse the generalization capabilities of first-order methods for statistical learning in multiple, different yet related, scenarios including supervised learning, transfer learning, robust learning and federated learning. To do so, we provide sharp upper and lower bounds for the minimax excess risk of strongly convex and smooth statistical learning when the gradient is accessed through partial observations given by a data-dependent oracle. This novel class of oracles can query the gradient with any given data distribution, and is thus well suited to scenarios in which the training data distribution does not match the target (or test) distribution. In particular, our upper and lower bounds are proportional to the smallest mean square error achievable by gradient estimators, thus allowing us to easily derive multiple sharp bounds in the aforementioned scenarios using the extensive literature on parameter estimation.
APA
Scaman, K., Even, M., Le Bars, B. & Massoulie, L. (2024). Minimax Excess Risk of First-Order Methods for Statistical Learning with Data-Dependent Oracles. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:3709-3717. Available from https://proceedings.mlr.press/v238/scaman24a.html.
