Near-Optimal Convex Simple Bilevel Optimization with a Bisection Method

Jiulin Wang, Xu Shi, Rujun Jiang
Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:2008-2016, 2024.

Abstract

This paper studies a class of simple bilevel optimization problems where we minimize a composite convex function at the upper-level subject to a composite convex lower-level problem. Existing methods either provide asymptotic guarantees for the upper-level objective or attain slow sublinear convergence rates. We propose a bisection algorithm to find a solution that is $\epsilon_f$-optimal for the upper-level objective and $\epsilon_g$-optimal for the lower-level objective. In each iteration, the binary search narrows the interval by assessing inequality system feasibility. Under mild conditions, the total operation complexity of our method is ${{\mathcal{O}}}\left(\max\{\sqrt{L_{f_1}/\epsilon_f},\sqrt{L_{g_1}/\epsilon_g}\} \right)$. Here, a unit operation can be a function evaluation, gradient evaluation, or the invocation of the proximal mapping, $L_{f_1}$ and $L_{g_1}$ are the Lipschitz constants of the upper- and lower-level objectives’ smooth components, and ${\mathcal{O}}$ hides logarithmic terms. Our approach achieves a near-optimal rate in unconstrained smooth or composite convex optimization when disregarding logarithmic terms. Numerical experiments demonstrate the effectiveness of our method.

Cite this Paper


BibTeX
@InProceedings{pmlr-v238-wang24d, title = {Near-Optimal Convex Simple Bilevel Optimization with a Bisection Method}, author = {Wang, Jiulin and Shi, Xu and Jiang, Rujun}, booktitle = {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics}, pages = {2008--2016}, year = {2024}, editor = {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen}, volume = {238}, series = {Proceedings of Machine Learning Research}, month = {02--04 May}, publisher = {PMLR}, pdf = {https://proceedings.mlr.press/v238/wang24d/wang24d.pdf}, url = {https://proceedings.mlr.press/v238/wang24d.html}, abstract = {This paper studies a class of simple bilevel optimization problems where we minimize a composite convex function at the upper-level subject to a composite convex lower-level problem. Existing methods either provide asymptotic guarantees for the upper-level objective or attain slow sublinear convergence rates. We propose a bisection algorithm to find a solution that is $\epsilon_f$-optimal for the upper-level objective and $\epsilon_g$-optimal for the lower-level objective. In each iteration, the binary search narrows the interval by assessing inequality system feasibility. Under mild conditions, the total operation complexity of our method is ${{\mathcal{O}}}\left(\max\{\sqrt{L_{f_1}/\epsilon_f},\sqrt{L_{g_1}/\epsilon_g}\} \right)$. Here, a unit operation can be a function evaluation, gradient evaluation, or the invocation of the proximal mapping, $L_{f_1}$ and $L_{g_1}$ are the Lipschitz constants of the upper- and lower-level objectives’ smooth components, and ${\mathcal{O}}$ hides logarithmic terms. Our approach achieves a near-optimal rate in unconstrained smooth or composite convex optimization when disregarding logarithmic terms. Numerical experiments demonstrate the effectiveness of our method.} }
Endnote
%0 Conference Paper %T Near-Optimal Convex Simple Bilevel Optimization with a Bisection Method %A Jiulin Wang %A Xu Shi %A Rujun Jiang %B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics %C Proceedings of Machine Learning Research %D 2024 %E Sanjoy Dasgupta %E Stephan Mandt %E Yingzhen Li %F pmlr-v238-wang24d %I PMLR %P 2008--2016 %U https://proceedings.mlr.press/v238/wang24d.html %V 238 %X This paper studies a class of simple bilevel optimization problems where we minimize a composite convex function at the upper-level subject to a composite convex lower-level problem. Existing methods either provide asymptotic guarantees for the upper-level objective or attain slow sublinear convergence rates. We propose a bisection algorithm to find a solution that is $\epsilon_f$-optimal for the upper-level objective and $\epsilon_g$-optimal for the lower-level objective. In each iteration, the binary search narrows the interval by assessing inequality system feasibility. Under mild conditions, the total operation complexity of our method is ${{\mathcal{O}}}\left(\max\{\sqrt{L_{f_1}/\epsilon_f},\sqrt{L_{g_1}/\epsilon_g}\} \right)$. Here, a unit operation can be a function evaluation, gradient evaluation, or the invocation of the proximal mapping, $L_{f_1}$ and $L_{g_1}$ are the Lipschitz constants of the upper- and lower-level objectives’ smooth components, and ${\mathcal{O}}$ hides logarithmic terms. Our approach achieves a near-optimal rate in unconstrained smooth or composite convex optimization when disregarding logarithmic terms. Numerical experiments demonstrate the effectiveness of our method.
APA
Wang, J., Shi, X. & Jiang, R.. (2024). Near-Optimal Convex Simple Bilevel Optimization with a Bisection Method. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:2008-2016 Available from https://proceedings.mlr.press/v238/wang24d.html.

Related Material