On The Complexity of First-Order Methods in Stochastic Bilevel Optimization

Jeongyeol Kwon, Dohyun Kwon, Hanbaek Lyu
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:25784-25811, 2024.

Abstract

We consider the problem of finding stationary points in bilevel optimization when the lower-level problem is unconstrained and strongly convex. The problem has been extensively studied in recent years; the main technical challenge is to keep track of lower-level solutions y*(x) in response to changes in the upper-level variables x. Subsequently, all existing approaches tie their analyses to a genie algorithm that knows lower-level solutions and, therefore, need not query any points far from them. We consider a dual question to such approaches: suppose we have an oracle, which we call y*-aware, that returns an O(ϵ)-estimate of the lower-level solution, in addition to first-order gradient estimators locally unbiased within the Θ(ϵ)-ball around y*(x). We study the complexity of finding stationary points with such a y*-aware oracle: we propose a simple first-order method that converges to an ϵ-stationary point using O(ϵ⁻⁶), O(ϵ⁻⁴) accesses to first-order y*-aware oracles. Our upper bounds also apply to standard unbiased first-order oracles, improving the best-known complexity of first-order methods by O(ϵ) with minimal assumptions. We then provide the matching Ω(ϵ⁻⁶), Ω(ϵ⁻⁴) lower bounds without and with an additional smoothness assumption, respectively. Our results imply that any approach that simulates an algorithm with a y*-aware oracle must suffer the same lower bounds.
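For context, the setting described above can be written as the standard stochastic bilevel program; the symbols f, g, and the dimensions below follow common convention in this literature rather than notation taken from the abstract itself:

```latex
\min_{x \in \mathbb{R}^{d_x}} \; F(x) := f\bigl(x, y^*(x)\bigr)
\qquad \text{s.t.} \qquad
y^*(x) := \operatorname*{arg\,min}_{y \in \mathbb{R}^{d_y}} \; g(x, y),
```

where g(x, ·) is strongly convex (so y*(x) is unique and well-defined), and an ϵ-stationary point is any x with ‖∇F(x)‖ ≤ ϵ.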

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-kwon24b,
  title     = {On The Complexity of First-Order Methods in Stochastic Bilevel Optimization},
  author    = {Kwon, Jeongyeol and Kwon, Dohyun and Lyu, Hanbaek},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {25784--25811},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/kwon24b/kwon24b.pdf},
  url       = {https://proceedings.mlr.press/v235/kwon24b.html},
  abstract  = {We consider the problem of finding stationary points in Bilevel optimization when the lower-level problem is unconstrained and strongly convex. The problem has been extensively studied in recent years; the main technical challenge is to keep track of lower-level solutions $y^*(x)$ in response to the changes in the upper-level variables $x$. Subsequently, all existing approaches tie their analyses to a genie algorithm that knows lower-level solutions and, therefore, need not query any points far from them. We consider a dual question to such approaches: suppose we have an oracle, which we call $y^*$-aware, that returns an $O(\epsilon)$-estimate of the lower-level solution, in addition to first-order gradient estimators locally unbiased within the $\Theta(\epsilon)$-ball around $y^*(x)$. We study the complexity of finding stationary points with such an $y^*$-aware oracle: we propose a simple first-order method that converges to an $\epsilon$ stationary point using $O(\epsilon^{-6}), O(\epsilon^{-4})$ access to first-order $y^*$-aware oracles. Our upper bounds also apply to standard unbiased first-order oracles, improving the best-known complexity of first-order methods by $O(\epsilon)$ with minimal assumptions. We then provide the matching $\Omega(\epsilon^{-6})$, $\Omega(\epsilon^{-4})$ lower bounds without and with an additional smoothness assumption, respectively. Our results imply that any approach that simulates an algorithm with an $y^*$-aware oracle must suffer the same lower bounds.}
}
Endnote
%0 Conference Paper
%T On The Complexity of First-Order Methods in Stochastic Bilevel Optimization
%A Jeongyeol Kwon
%A Dohyun Kwon
%A Hanbaek Lyu
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-kwon24b
%I PMLR
%P 25784--25811
%U https://proceedings.mlr.press/v235/kwon24b.html
%V 235
%X We consider the problem of finding stationary points in Bilevel optimization when the lower-level problem is unconstrained and strongly convex. The problem has been extensively studied in recent years; the main technical challenge is to keep track of lower-level solutions $y^*(x)$ in response to the changes in the upper-level variables $x$. Subsequently, all existing approaches tie their analyses to a genie algorithm that knows lower-level solutions and, therefore, need not query any points far from them. We consider a dual question to such approaches: suppose we have an oracle, which we call $y^*$-aware, that returns an $O(\epsilon)$-estimate of the lower-level solution, in addition to first-order gradient estimators locally unbiased within the $\Theta(\epsilon)$-ball around $y^*(x)$. We study the complexity of finding stationary points with such an $y^*$-aware oracle: we propose a simple first-order method that converges to an $\epsilon$ stationary point using $O(\epsilon^{-6}), O(\epsilon^{-4})$ access to first-order $y^*$-aware oracles. Our upper bounds also apply to standard unbiased first-order oracles, improving the best-known complexity of first-order methods by $O(\epsilon)$ with minimal assumptions. We then provide the matching $\Omega(\epsilon^{-6})$, $\Omega(\epsilon^{-4})$ lower bounds without and with an additional smoothness assumption, respectively. Our results imply that any approach that simulates an algorithm with an $y^*$-aware oracle must suffer the same lower bounds.
APA
Kwon, J., Kwon, D. & Lyu, H. (2024). On The Complexity of First-Order Methods in Stochastic Bilevel Optimization. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:25784-25811. Available from https://proceedings.mlr.press/v235/kwon24b.html.