Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback

Ziwei Guan, Yi Zhou, Yingbin Liang
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:3328-3355, 2023.

Abstract

We investigate online nonconvex optimization from a local regret minimization perspective. Previous studies along this line implicitly required access to sufficiently many gradient oracles at each time instance in order to design double-loop algorithms. In this work, we focus on the more challenging but practical settings where only a limited number of oracle calls is available at each time instance, including a window-smoothed single gradient oracle (Window-SGO), a single function value oracle (Window-SVO), and multiple function value oracles (Window-MVO). Specifically, in the Window-SGO setting, which admits only single-loop algorithm designs, we derive a local regret lower bound, which indicates that single-loop algorithms are provably worse than double-loop algorithms. Further, the simple classical OGD algorithm achieves the window-unconditioned lower bound. Moreover, in the Window-SVO setting, we propose a novel single-loop online algorithm named SkipOGD, and show that it achieves a near-optimal local regret that matches the Window-SGO regret lower bound up to a factor of the dimension $d$ arising from the function value feedback. Lastly, in the Window-MVO setting, we propose a new double-loop online algorithm named LoopOGD and show that it achieves a smooth trade-off between regret minimization and sample complexity as the number of oracle calls $K$ per time instance varies. In particular, with $K=1$ and $K=wd$, LoopOGD respectively achieves our regret lower bound with Window-SGO (up to the factor $d$ due to function value feedback) and the existing regret lower bound with multiple gradient oracle feedback.
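
To make the setting concrete, the following minimal Python sketch shows the classical OGD baseline discussed above, run against a simulated window-smoothed gradient oracle. It assumes the standard local-regret setup (window-smoothed loss F_{t,w}(x) = (1/w) * sum_{i=0}^{w-1} f_{t-i}(x) and local regret sum_t ||grad F_{t,w}(x_t)||^2); the oracle simulation, function names, and toy losses are illustrative assumptions, not the paper's implementation.

import numpy as np

def make_window_sgo(loss_grads, w):
    # Simulated Window-SGO oracle (illustrative assumption): at time t it
    # returns grad F_{t,w}(x) = (1/w) * sum_{i=0}^{w-1} grad f_{t-i}(x),
    # i.e., the gradient of the window-smoothed loss at the queried point.
    # Early rounds average over the losses revealed so far.
    def oracle(t, x):
        window = loss_grads[max(0, t - w + 1): t + 1]
        return sum(g(x) for g in window) / len(window)
    return oracle

def ogd(oracle, x0, T, eta):
    # Classical single-loop OGD: exactly one oracle call per time instance.
    # Tallies the local regret sum_t ||grad F_{t,w}(x_t)||^2 for illustration.
    x = np.asarray(x0, dtype=float)
    local_regret = 0.0
    for t in range(T):
        g = oracle(t, x)              # the single oracle call allowed at time t
        local_regret += float(g @ g)  # local-regret contribution of round t
        x = x - eta * g               # gradient descent step
    return x, local_regret

# Toy run on quadratics f_t(x) = 0.5 * ||x - c_t||^2 (hypothetical losses).
rng = np.random.default_rng(0)
centers = [rng.normal(size=3) for _ in range(50)]
loss_grads = [(lambda c: (lambda x: x - c))(c) for c in centers]
x_final, regret = ogd(make_window_sgo(loss_grads, w=5), x0=np.zeros(3), T=50, eta=0.1)

As general intuition (not specific to this paper's proofs), the factor of $d$ in the function-value-feedback results is typical of zeroth-order methods, where a gradient in $d$ dimensions must be estimated from scalar function evaluations.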

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-guan23a,
  title     = {Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback},
  author    = {Guan, Ziwei and Zhou, Yi and Liang, Yingbin},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {3328--3355},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/guan23a/guan23a.pdf},
  url       = {https://proceedings.mlr.press/v195/guan23a.html}
}
APA
Guan, Z., Zhou, Y., & Liang, Y. (2023). Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:3328-3355. Available from https://proceedings.mlr.press/v195/guan23a.html.
