Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond

Oliver Hinder; Aaron Sidford; Nimit Sohoni

Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond

Oliver Hinder, Aaron Sidford, Nimit Sohoni

Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:1894-1938, 2020.

Abstract

In this paper, we provide near-optimal accelerated first-order methods for minimizing a broad class of smooth nonconvex functions that are unimodal on all lines through a minimizer. This function class, which we call the class of smooth quasar-convex functions, is parameterized by a constant

$\gamma \in (0,1]$ :

$\gamma = 1$ encompasses the classes of smooth convex and star-convex functions, and smaller values of

$\gamma$ indicate that the function can be "more nonconvex." We develop a variant of accelerated gradient descent that computes an

$\epsilon$ -approximate minimizer of a smooth

$\gamma$ -quasar-convex function with at most

$O(\gamma^{-1} \epsilon^{-1/2} \log(\gamma^{-1} \epsilon^{-1}))$ total function and gradient evaluations. We also derive a lower bound of

$\Omega(\gamma^{-1} \epsilon^{-1/2})$ on the worst-case number of gradient evaluations required by any deterministic first-order method, showing that, up to a logarithmic factor, no deterministic first-order method can improve upon ours.

Cite this Paper

BibTeX


@InProceedings{pmlr-v125-hinder20a,
  title = 	 {Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond},
  author =       {Hinder, Oliver and Sidford, Aaron and Sohoni, Nimit},
  booktitle = 	 {Proceedings of Thirty Third Conference on Learning Theory},
  pages = 	 {1894--1938},
  year = 	 {2020},
  editor = 	 {Abernethy, Jacob and Agarwal, Shivani},
  volume = 	 {125},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {09--12 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v125/hinder20a/hinder20a.pdf},
  url = 	 {https://proceedings.mlr.press/v125/hinder20a.html},
  abstract = 	 { In this paper, we provide near-optimal accelerated first-order methods for minimizing a broad class of smooth nonconvex functions that are unimodal on all lines through a minimizer. This function class, which we call the class of smooth quasar-convex functions, is parameterized by a constant $$\gamma \in (0,1]$$: $$\gamma = 1$$ encompasses the classes of smooth convex and star-convex functions, and smaller values of $$\gamma$$ indicate that the function can be "more nonconvex." We develop a variant of accelerated gradient descent that computes an $$\epsilon$$-approximate minimizer of a smooth $$\gamma$$-quasar-convex function with at most $$O(\gamma^{-1} \epsilon^{-1/2} \log(\gamma^{-1} \epsilon^{-1}))$$ total function and gradient evaluations. We also derive a lower bound of $$\Omega(\gamma^{-1} \epsilon^{-1/2})$$ on the worst-case number of gradient evaluations required by any deterministic first-order method, showing that, up to a logarithmic factor, no deterministic first-order method can improve upon ours.}
}

Endnote

%0 Conference Paper
%T Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond
%A Oliver Hinder
%A Aaron Sidford
%A Nimit Sohoni
%B Proceedings of Thirty Third Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2020
%E Jacob Abernethy
%E Shivani Agarwal	
%F pmlr-v125-hinder20a
%I PMLR
%P 1894--1938
%U https://proceedings.mlr.press/v125/hinder20a.html
%V 125
%X  In this paper, we provide near-optimal accelerated first-order methods for minimizing a broad class of smooth nonconvex functions that are unimodal on all lines through a minimizer. This function class, which we call the class of smooth quasar-convex functions, is parameterized by a constant $$\gamma \in (0,1]$$: $$\gamma = 1$$ encompasses the classes of smooth convex and star-convex functions, and smaller values of $$\gamma$$ indicate that the function can be "more nonconvex." We develop a variant of accelerated gradient descent that computes an $$\epsilon$$-approximate minimizer of a smooth $$\gamma$$-quasar-convex function with at most $$O(\gamma^{-1} \epsilon^{-1/2} \log(\gamma^{-1} \epsilon^{-1}))$$ total function and gradient evaluations. We also derive a lower bound of $$\Omega(\gamma^{-1} \epsilon^{-1/2})$$ on the worst-case number of gradient evaluations required by any deterministic first-order method, showing that, up to a logarithmic factor, no deterministic first-order method can improve upon ours.

APA


Hinder, O., Sidford, A. & Sohoni, N.. (2020). Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond. Proceedings of Thirty Third Conference on Learning Theory, in Proceedings of Machine Learning Research 125:1894-1938 Available from https://proceedings.mlr.press/v125/hinder20a.html.

Related Material

Download PDF