Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback

Ziwei Guan; Yi Zhou; Yingbin Liang

Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback

Ziwei Guan, Yi Zhou, Yingbin Liang

Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:3328-3355, 2023.

Abstract

We investigate online nonconvex optimization from a local regret minimization perspective. Previous studies along this line implicitly required the access to sufficient gradient oracles at each time instance in order to design double-loop algorithms. In this work, we focus on more challenging but practical settings where only limited number of oracles are available in online nonconvex optimization, including window-smoothed single gradient oracle (Window-SGO), single function value oracle (Window-SVO) and multiple function value oracles (Window-MVO). Specifically, in the Window-SGO setting which allows only single-loop algorithm design, we derive a local regret lower bound, which indicates that single-loop algorithms are provably worse than double-loop algorithms. Further, the simple classical OGD algorithm achieves the window-unconditioned lower bound. Moreover, in the Window-SVO setting, we propose a novel single-loop online algorithm named SkipOGD, and show that it achieves a near-optimal local regret that matches the Window-SGO regret lower bound up to a factor of the dimension

$d$ due to the function value feedback. Lastly, in the Window-MVO setting, we propose a new double-loop online algorithm named LoopOGD and show that it achieves a smooth trade-off between regret minimization and sample complexity over the number of oracle calls

$K$ per time instance. In particular, with

$K=1$ and

$wd$ , LoopOGD respectively achieves our regret lower bound with Window-SGO (up to the factor

$d$ due to function value feedback) and the existing regret lower bound with multiple gradient oracle feedback.

Cite this Paper

BibTeX


@InProceedings{pmlr-v195-guan23a,
  title = 	 {Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback},
  author =       {Guan, Ziwei and Zhou, Yi and Liang, Yingbin},
  booktitle = 	 {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages = 	 {3328--3355},
  year = 	 {2023},
  editor = 	 {Neu, Gergely and Rosasco, Lorenzo},
  volume = 	 {195},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {12--15 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v195/guan23a/guan23a.pdf},
  url = 	 {https://proceedings.mlr.press/v195/guan23a.html},
  abstract = 	 {We investigate online nonconvex optimization from a local regret minimization perspective. Previous studies along this line implicitly required the access to sufficient gradient oracles at each time instance in order to design double-loop algorithms. In this work, we focus on more challenging but practical settings where only limited number of oracles are available in online nonconvex optimization, including window-smoothed single gradient oracle (Window-SGO), single function value oracle (Window-SVO) and multiple function value oracles (Window-MVO). Specifically, in the Window-SGO setting which allows only single-loop algorithm design, we derive a local regret lower bound, which indicates that single-loop algorithms are provably worse than double-loop algorithms. Further, the simple classical OGD algorithm achieves the window-unconditioned lower bound. Moreover, in the Window-SVO setting, we propose a novel single-loop online algorithm named SkipOGD, and show that it achieves a near-optimal local regret that matches the Window-SGO regret lower bound up to a factor of the dimension $d$ due to the function value feedback. Lastly, in the Window-MVO setting, we propose a new double-loop online algorithm named LoopOGD and show that it achieves a smooth trade-off between regret minimization and sample complexity over the number of oracle calls $K$ per time instance. In particular, with $K=1$ and $wd$, LoopOGD respectively achieves our regret lower bound with Window-SGO (up to the factor $d$ due to function value feedback) and the existing regret lower bound with multiple gradient oracle feedback. }
}

Endnote

%0 Conference Paper
%T Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback
%A Ziwei Guan
%A Yi Zhou
%A Yingbin Liang
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco	
%F pmlr-v195-guan23a
%I PMLR
%P 3328--3355
%U https://proceedings.mlr.press/v195/guan23a.html
%V 195
%X We investigate online nonconvex optimization from a local regret minimization perspective. Previous studies along this line implicitly required the access to sufficient gradient oracles at each time instance in order to design double-loop algorithms. In this work, we focus on more challenging but practical settings where only limited number of oracles are available in online nonconvex optimization, including window-smoothed single gradient oracle (Window-SGO), single function value oracle (Window-SVO) and multiple function value oracles (Window-MVO). Specifically, in the Window-SGO setting which allows only single-loop algorithm design, we derive a local regret lower bound, which indicates that single-loop algorithms are provably worse than double-loop algorithms. Further, the simple classical OGD algorithm achieves the window-unconditioned lower bound. Moreover, in the Window-SVO setting, we propose a novel single-loop online algorithm named SkipOGD, and show that it achieves a near-optimal local regret that matches the Window-SGO regret lower bound up to a factor of the dimension $d$ due to the function value feedback. Lastly, in the Window-MVO setting, we propose a new double-loop online algorithm named LoopOGD and show that it achieves a smooth trade-off between regret minimization and sample complexity over the number of oracle calls $K$ per time instance. In particular, with $K=1$ and $wd$, LoopOGD respectively achieves our regret lower bound with Window-SGO (up to the factor $d$ due to function value feedback) and the existing regret lower bound with multiple gradient oracle feedback.

APA


Guan, Z., Zhou, Y. & Liang, Y.. (2023). Online Nonconvex Optimization with Limited Instantaneous Oracle Feedback. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:3328-3355 Available from https://proceedings.mlr.press/v195/guan23a.html.

Related Material

Download PDF