Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

Ashok Cutkosky; Harsh Mehta; Francesco Orabona

Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

Ashok Cutkosky, Harsh Mehta, Francesco Orabona

Proceedings of the 40th International Conference on Machine Learning, PMLR 202:6643-6670, 2023.

Abstract

We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. This improves the current best-known complexity for finding a

$(\delta,\epsilon)$ -stationary point from

$O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to

$O(\epsilon^{-3}\delta^{-1})$ , which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of

$O(\epsilon^{-1.5}\delta^{-0.5})$ . Our improved non-smooth analysis also immediately recovers all optimal or best-known results for finding

$\epsilon$ stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.

Cite this Paper

BibTeX


@InProceedings{pmlr-v202-cutkosky23a,
  title = 	 {Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion},
  author =       {Cutkosky, Ashok and Mehta, Harsh and Orabona, Francesco},
  booktitle = 	 {Proceedings of the 40th International Conference on Machine Learning},
  pages = 	 {6643--6670},
  year = 	 {2023},
  editor = 	 {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume = 	 {202},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {23--29 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v202/cutkosky23a/cutkosky23a.pdf},
  url = 	 {https://proceedings.mlr.press/v202/cutkosky23a.html},
  abstract = 	 {We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. This improves the current best-known complexity for finding a $(\delta,\epsilon)$-stationary point from $O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to $O(\epsilon^{-3}\delta^{-1})$, which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of $O(\epsilon^{-1.5}\delta^{-0.5})$. Our improved non-smooth analysis also immediately recovers all optimal or best-known results for finding $\epsilon$ stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.}
}

Endnote

%0 Conference Paper
%T Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion
%A Ashok Cutkosky
%A Harsh Mehta
%A Francesco Orabona
%B Proceedings of the 40th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2023
%E Andreas Krause
%E Emma Brunskill
%E Kyunghyun Cho
%E Barbara Engelhardt
%E Sivan Sabato
%E Jonathan Scarlett	
%F pmlr-v202-cutkosky23a
%I PMLR
%P 6643--6670
%U https://proceedings.mlr.press/v202/cutkosky23a.html
%V 202
%X We present new algorithms for optimizing non-smooth, non-convex stochastic objectives based on a novel analysis technique. This improves the current best-known complexity for finding a $(\delta,\epsilon)$-stationary point from $O(\epsilon^{-4}\delta^{-1})$ stochastic gradient queries to $O(\epsilon^{-3}\delta^{-1})$, which we also show to be optimal. Our primary technique is a reduction from non-smooth non-convex optimization to online learning, after which our results follow from standard regret bounds in online learning. For deterministic and second-order smooth objectives, applying more advanced optimistic online learning techniques enables a new complexity of $O(\epsilon^{-1.5}\delta^{-0.5})$. Our improved non-smooth analysis also immediately recovers all optimal or best-known results for finding $\epsilon$ stationary points of smooth or second-order smooth objectives in both stochastic and deterministic settings.

APA


Cutkosky, A., Mehta, H. & Orabona, F.. (2023). Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:6643-6670 Available from https://proceedings.mlr.press/v202/cutkosky23a.html.

Optimal Stochastic Non-smooth Non-convex Optimization through Online-to-Non-convex Conversion

Abstract

Cite this Paper

Related Material