A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization

Mathieu Dagréou, Thomas Moreau, Samuel Vaiter, Pierre Ablin
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:82-90, 2024.

Abstract

Bilevel optimization problems, which are problems where two optimization problems are nested, have more and more applications in machine learning. In many practical cases, the upper and the lower objectives correspond to empirical risk minimization problems and therefore have a sum structure. In this context, we propose a bilevel extension of the celebrated SARAH algorithm. We demonstrate that the algorithm requires $O((n+m)^{1/2}\epsilon^{-1})$ oracle calls to achieve $\epsilon$-stationarity with $n+m$ the total number of samples, which improves over all previous bilevel algorithms. Moreover, we provide a lower bound on the number of oracle calls required to get an approximate stationary point of the objective function of the bilevel problem. This lower bound is attained by our algorithm, making it optimal in terms of sample complexity.
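
For reference, the setting described above is commonly written as follows. This is only a sketch of the standard formulation. The symbols $h$, $F$, $G$, $F_j$, $G_i$, $x$, $z$, and the assignment of $m$ samples to the upper level and $n$ to the lower level, are notational assumptions and are not stated on this page:

$$\min_{x} \; h(x) = F(z^*(x), x) \quad \text{where} \quad z^*(x) \in \arg\min_{z} G(z, x),$$

$$F(z, x) = \frac{1}{m} \sum_{j=1}^{m} F_j(z, x), \qquad G(z, x) = \frac{1}{n} \sum_{i=1}^{n} G_i(z, x).$$

Under this reading, the "sum structure" refers to the two finite averages. $\epsilon$-stationarity would typically be taken to mean $\mathbb{E}\,\|\nabla h(x)\|^2 \le \epsilon$, which is an assumption here about the convention behind the $\epsilon^{-1}$ rate.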

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-dagreou24a,
  title = {A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization},
  author = {Dagr\'{e}ou, Mathieu and Moreau, Thomas and Vaiter, Samuel and Ablin, Pierre},
  booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = {82--90},
  year = {2024},
  editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = {238},
  series = {Proceedings of Machine Learning Research},
  month = {02--04 May},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v238/dagreou24a/dagreou24a.pdf},
  url = {https://proceedings.mlr.press/v238/dagreou24a.html},
  abstract = {Bilevel optimization problems, which are problems where two optimization problems are nested, have more and more applications in machine learning. In many practical cases, the upper and the lower objectives correspond to empirical risk minimization problems and therefore have a sum structure. In this context, we propose a bilevel extension of the celebrated SARAH algorithm. We demonstrate that the algorithm requires $O((n+m)^{1/2}\epsilon^{-1})$ oracle calls to achieve $\epsilon$-stationarity with $n+m$ the total number of samples, which improves over all previous bilevel algorithms. Moreover, we provide a lower bound on the number of oracle calls required to get an approximate stationary point of the objective function of the bilevel problem. This lower bound is attained by our algorithm, making it optimal in terms of sample complexity.}
}
Endnote
%0 Conference Paper
%T A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization
%A Mathieu Dagréou
%A Thomas Moreau
%A Samuel Vaiter
%A Pierre Ablin
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-dagreou24a
%I PMLR
%P 82--90
%U https://proceedings.mlr.press/v238/dagreou24a.html
%V 238
%X Bilevel optimization problems, which are problems where two optimization problems are nested, have more and more applications in machine learning. In many practical cases, the upper and the lower objectives correspond to empirical risk minimization problems and therefore have a sum structure. In this context, we propose a bilevel extension of the celebrated SARAH algorithm. We demonstrate that the algorithm requires $O((n+m)^{1/2}\epsilon^{-1})$ oracle calls to achieve $\epsilon$-stationarity with $n+m$ the total number of samples, which improves over all previous bilevel algorithms. Moreover, we provide a lower bound on the number of oracle calls required to get an approximate stationary point of the objective function of the bilevel problem. This lower bound is attained by our algorithm, making it optimal in terms of sample complexity.
APA
Dagréou, M., Moreau, T., Vaiter, S., & Ablin, P. (2024). A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:82-90. Available from https://proceedings.mlr.press/v238/dagreou24a.html.

Related Material