Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond

Oliver Hinder, Aaron Sidford, Nimit Sohoni
Proceedings of Thirty Third Conference on Learning Theory, PMLR 125:1894-1938, 2020.

Abstract

In this paper, we provide near-optimal accelerated first-order methods for minimizing a broad class of smooth nonconvex functions that are unimodal on all lines through a minimizer. This function class, which we call the class of smooth quasar-convex functions, is parameterized by a constant $$\gamma \in (0,1]$$: $$\gamma = 1$$ encompasses the classes of smooth convex and star-convex functions, and smaller values of $$\gamma$$ indicate that the function can be "more nonconvex." We develop a variant of accelerated gradient descent that computes an $$\epsilon$$-approximate minimizer of a smooth $$\gamma$$-quasar-convex function with at most $$O(\gamma^{-1} \epsilon^{-1/2} \log(\gamma^{-1} \epsilon^{-1}))$$ total function and gradient evaluations. We also derive a lower bound of $$\Omega(\gamma^{-1} \epsilon^{-1/2})$$ on the worst-case number of gradient evaluations required by any deterministic first-order method, showing that, up to a logarithmic factor, no deterministic first-order method can improve upon ours.
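
For reference, the quasar-convexity condition underlying these results can be stated as follows (paraphrased here; the precise definition appears in the paper itself): a differentiable function $$f$$ is $$\gamma$$-quasar-convex with respect to a minimizer $$x^*$$ if, for every point $$x$$, $$f(x^*) \geq f(x) + \tfrac{1}{\gamma} \nabla f(x)^\top (x^* - x)$$. Taking $$\gamma = 1$$ yields a condition satisfied by smooth convex and star-convex functions, while smaller values of $$\gamma$$ weaken the inequality and admit functions that are increasingly far from convex.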

Cite this Paper


BibTeX
@InProceedings{pmlr-v125-hinder20a,
  title     = {Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond},
  author    = {Hinder, Oliver and Sidford, Aaron and Sohoni, Nimit},
  booktitle = {Proceedings of Thirty Third Conference on Learning Theory},
  pages     = {1894--1938},
  year      = {2020},
  editor    = {Abernethy, Jacob and Agarwal, Shivani},
  volume    = {125},
  series    = {Proceedings of Machine Learning Research},
  month     = {09--12 Jul},
  publisher = {PMLR},
  pdf       = {http://proceedings.mlr.press/v125/hinder20a/hinder20a.pdf},
  url       = {https://proceedings.mlr.press/v125/hinder20a.html}
}
Endnote
%0 Conference Paper
%T Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond
%A Oliver Hinder
%A Aaron Sidford
%A Nimit Sohoni
%B Proceedings of Thirty Third Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2020
%E Jacob Abernethy
%E Shivani Agarwal
%F pmlr-v125-hinder20a
%I PMLR
%P 1894--1938
%U https://proceedings.mlr.press/v125/hinder20a.html
%V 125
APA
Hinder, O., Sidford, A., & Sohoni, N. (2020). Near-Optimal Methods for Minimizing Star-Convex Functions and Beyond. Proceedings of Thirty Third Conference on Learning Theory, in Proceedings of Machine Learning Research 125:1894-1938. Available from https://proceedings.mlr.press/v125/hinder20a.html.
