Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

Ashok Cutkosky, Harsh Mehta, Francesco Orabona
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:6643-6670, 2023.

Abstract

We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. This improves the current best-known complexity for finding a $(\delta,\epsilon)$-stationary point from $O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to $O(\epsilon^{-3}\delta^{-1})$, which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of $O(\epsilon^{-1.5}\delta^{-0.5})$. Our improved non-smooth analysis also immediately recovers all optimal or best-known results for finding $\epsilon$ stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.
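The paper's central device is the reduction from non-smooth non-convex optimization to online learning. Below is a minimal, illustrative sketch of such an online-to-non-convex conversion loop, assuming an online-gradient-descent learner over a small ball of increments, a toy subgradient oracle, random evaluation points along each update segment, and a simple window-averaging output rule. The function name o2nc_sketch, the toy objective, and all constants are hypothetical choices for illustration, not the paper's exact algorithm or guarantees.

# Illustrative sketch: an online learner proposes bounded increments Delta_t;
# the iterate moves by Delta_t, and the learner is charged the linear loss
# <g_t, Delta_t>, where g_t is a (stochastic) gradient taken at a uniformly
# random point on the segment between consecutive iterates. Window averages of
# the iterates serve as candidate (delta, epsilon)-stationary points.
import numpy as np

def o2nc_sketch(grad, x0, T=3000, D=0.01, lr=0.005, window=100, seed=None):
    """grad: (sub)gradient oracle; D: per-step movement bound; window: averaging length."""
    rng = np.random.default_rng(seed)
    x = x0.astype(float)
    delta = np.zeros_like(x)          # the online learner's current play
    buf, window_avgs = [], []
    for t in range(T):
        x_prev = x
        x = x_prev + delta            # apply the learner's increment
        s = rng.uniform()             # random point on the segment [x_prev, x]
        g = grad(x_prev + s * delta)
        # Online gradient descent on the linear loss <g, .>, projected to a ball of radius D.
        delta = delta - lr * g
        nrm = np.linalg.norm(delta)
        if nrm > D:
            delta *= D / nrm
        buf.append(x)
        if len(buf) == window:
            window_avgs.append(np.mean(buf, axis=0))   # candidate stationary point
            buf = []
    return window_avgs

if __name__ == "__main__":
    # Toy non-smooth objective f(x) = ||x||_1 + 0.5 ||x||^2 and its subgradient.
    grad = lambda x: np.sign(x) + x
    candidates = o2nc_sketch(grad, x0=np.ones(5), seed=0)
    print(candidates[-1])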

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-cutkosky23a,
  title     = {Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion},
  author    = {Cutkosky, Ashok and Mehta, Harsh and Orabona, Francesco},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {6643--6670},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/cutkosky23a/cutkosky23a.pdf},
  url       = {https://proceedings.mlr.press/v202/cutkosky23a.html}
}
APA
Cutkosky, A., Mehta, H. & Orabona, F. (2023). Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:6643-6670. Available from https://proceedings.mlr.press/v202/cutkosky23a.html.
