New Lower Bounds for Non-Convex Stochastic Optimization through Divergence Decomposition

El Mehdi Saad, Wei-Cheng Lee, Francesco Orabona
Proceedings of Thirty Eighth Conference on Learning Theory, PMLR 291:5054-5107, 2025.

Abstract

We study fundamental limits of first-order stochastic optimization in a range of non-convex settings, including $L$-smooth functions satisfying Quasar-Convexity (QC), Quadratic Growth (QG), and Restricted Secant Inequalities (RSI). While the convergence properties of standard algorithms are well understood in deterministic regimes, significantly fewer results address the stochastic case, where only unbiased and noisy gradients are available. We establish new lower bounds on the number of noisy gradient queries to minimize these classes of functions, also showing that they are tight (up to a logarithmic factor) in all the relevant quantities characterizing each class. Our approach reformulates the optimization task as a function identification problem, leveraging divergence decomposition arguments to construct a challenging subclass that leads to sharp lower bounds. Furthermore, we present a specialized algorithm in the one-dimensional setting that achieves faster rates, suggesting that certain dimensional thresholds are intrinsic to the complexity of non-convex stochastic optimization.
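For reference, the sketch below records the standard forms of the conditions named in the abstract, as they commonly appear in the optimization literature; the exact normalizations and constants are an assumption here and may differ from the paper's formal definitions.

\begin{align*}
&\text{$L$-smoothness:} && \|\nabla f(x) - \nabla f(y)\| \le L \|x - y\| \quad \forall x, y, \\
&\text{Quasar-Convexity (QC), } \gamma \in (0,1]: && f(x^\star) \ge f(x) + \tfrac{1}{\gamma} \langle \nabla f(x),\, x^\star - x \rangle \quad \forall x, \\
&\text{Quadratic Growth (QG), } \mu > 0: && f(x) - f(x^\star) \ge \tfrac{\mu}{2} \|x - x_p\|^2 \quad \forall x, \\
&\text{Restricted Secant Inequality (RSI), } \mu > 0: && \langle \nabla f(x),\, x - x_p \rangle \ge \mu \|x - x_p\|^2 \quad \forall x,
\end{align*}

where $x^\star$ is a minimizer of $f$ and $x_p$ denotes the projection of $x$ onto the set of minimizers.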

Cite this Paper


BibTeX
@InProceedings{pmlr-v291-saad25a,
  title = {New Lower Bounds for Non-Convex Stochastic Optimization through Divergence Decomposition},
  author = {Saad, El Mehdi and Lee, Wei-Cheng and Orabona, Francesco},
  booktitle = {Proceedings of Thirty Eighth Conference on Learning Theory},
  pages = {5054--5107},
  year = {2025},
  editor = {Haghtalab, Nika and Moitra, Ankur},
  volume = {291},
  series = {Proceedings of Machine Learning Research},
  month = {30 Jun--04 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v291/main/assets/saad25a/saad25a.pdf},
  url = {https://proceedings.mlr.press/v291/saad25a.html},
  abstract = {We study fundamental limits of first\hyp order stochastic optimization in a range of non\hyp convex settings, including $L$-smooth functions satisfying Quasar\hyp Convexity ({QC}), Quadratic Growth ({QG}), and Restricted Secant Inequalities ({RSI}). While the convergence properties of standard algorithms are well understood in deterministic regimes, significantly fewer results address the stochastic case, where only unbiased and noisy gradients are available. We establish new lower bounds on the number of noisy gradient queries to minimize these classes of functions, also showing that they are tight (up to a logarithmic factor) in all the relevant quantities characterizing each class. Our approach reformulates the optimization task as a function identification problem, leveraging \textit{divergence decomposition} arguments to construct a challenging subclass that leads to sharp lower bounds. Furthermore, we present a specialized algorithm in the one\hyp dimensional setting that achieves faster rates, suggesting that certain dimensional thresholds are intrinsic to the complexity of non\hyp convex stochastic optimization.}
}
Endnote
%0 Conference Paper
%T New Lower Bounds for Non-Convex Stochastic Optimization through Divergence Decomposition
%A El Mehdi Saad
%A Wei-Cheng Lee
%A Francesco Orabona
%B Proceedings of Thirty Eighth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2025
%E Nika Haghtalab
%E Ankur Moitra
%F pmlr-v291-saad25a
%I PMLR
%P 5054--5107
%U https://proceedings.mlr.press/v291/saad25a.html
%V 291
%X We study fundamental limits of first\hyp order stochastic optimization in a range of non\hyp convex settings, including $L$-smooth functions satisfying Quasar\hyp Convexity ({QC}), Quadratic Growth ({QG}), and Restricted Secant Inequalities ({RSI}). While the convergence properties of standard algorithms are well understood in deterministic regimes, significantly fewer results address the stochastic case, where only unbiased and noisy gradients are available. We establish new lower bounds on the number of noisy gradient queries to minimize these classes of functions, also showing that they are tight (up to a logarithmic factor) in all the relevant quantities characterizing each class. Our approach reformulates the optimization task as a function identification problem, leveraging \textit{divergence decomposition} arguments to construct a challenging subclass that leads to sharp lower bounds. Furthermore, we present a specialized algorithm in the one\hyp dimensional setting that achieves faster rates, suggesting that certain dimensional thresholds are intrinsic to the complexity of non\hyp convex stochastic optimization.
APA
Saad, E.M., Lee, W.-C. & Orabona, F. (2025). New Lower Bounds for Non-Convex Stochastic Optimization through Divergence Decomposition. Proceedings of Thirty Eighth Conference on Learning Theory, in Proceedings of Machine Learning Research 291:5054-5107. Available from https://proceedings.mlr.press/v291/saad25a.html.
