[edit]
Faster Newton Methods for Convex and Nonconvex Optimization in Gradient Complexity
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:1088-1112, 2026.
Abstract
Second-order optimization methods are computationally expensive for large-scale problems. Recently, Doikov, Chayti, and Jaggi (ICML 2023) proposed the LazyCRN method that reduces computation by studying the gradient complexity of second-order methods. Their method can achieve a gradient complexity of $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-3/2})$ and $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-1/2})$ for nonconvex and convex optimization, respectively, where $\bar d$ is the effective dimension and $\epsilon$ is the target precision. Very recently, Adil, Bullins, Sidford, and Zhang (NeurIPS 2025) improved the gradient complexity to $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2} \ln^{18} \epsilon^{-1})$ for nonconvex optimization. However, the tightness of these methods remains open. In this work, we propose new methods that achieve an improved complexity of $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2})$ and $\mathcal{O}( (\bar d + \bar d^{13/21} \epsilon^{-2/7}) \ln \bar d)$ for nonconvex and convex optimization, respectively, improving best-known results for both setups.