Faster Newton Methods for Convex and Nonconvex Optimization in Gradient Complexity

Lesi Chen; Chengchang Liu; Luo Luo; Jingzhao Zhang

Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:1088-1112, 2026.

Abstract

Second-order optimization methods are computationally expensive for large-scale problems. Recently, Doikov, Chayti, and Jaggi (ICML 2023) proposed the LazyCRN method that reduces computation by studying the gradient complexity of second-order methods. Their method can achieve a gradient complexity of $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-3/2})$ and $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-1/2})$ for nonconvex and convex optimization, respectively, where $\bar d$ is the effective dimension and $\epsilon$ is the target precision. Very recently, Adil, Bullins, Sidford, and Zhang (NeurIPS 2025) improved the gradient complexity to $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2} \ln^{18} \epsilon^{-1})$ for nonconvex optimization. However, the tightness of these methods remains open. In this work, we propose new methods that achieve an improved complexity of $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2})$ and $\mathcal{O}( (\bar d + \bar d^{13/21} \epsilon^{-2/7}) \ln \bar d)$ for nonconvex and convex optimization, respectively, improving best-known results for both setups.

Cite this Paper

BibTeX

@InProceedings{pmlr-v336-chen26a,
  title = 	 {Faster Newton Methods for Convex and Nonconvex Optimization in Gradient Complexity},
  author =       {Chen, Lesi and Liu, Chengchang and Luo, Luo and Zhang, Jingzhao},
  booktitle = 	 {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = 	 {1088--1112},
  year = 	 {2026},
  editor = 	 {Hanneke, Steve and Lattimore, Tor},
  volume = 	 {336},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {29 Jun--03 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v336/main/assets/chen26a/chen26a.pdf},
  url = 	 {https://proceedings.mlr.press/v336/chen26a.html},
  abstract = 	 {Second-order optimization methods are computationally expensive for large-scale problems. Recently, Doikov, Chayti, and Jaggi (ICML 2023) proposed the LazyCRN method that reduces computation by studying the gradient complexity of second-order methods. Their method can achieve a gradient complexity of $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-3/2})$ and $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-1/2})$ for nonconvex and convex optimization, respectively, where $\bar d$ is the effective dimension and $\epsilon$ is the target precision. Very recently, Adil, Bullins, Sidford, and Zhang (NeurIPS 2025) improved the gradient complexity to $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2} \ln^{18} \epsilon^{-1})$ for nonconvex optimization. However, the tightness of these methods remains open. In this work, we propose new methods that achieve an improved complexity of $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2})$ and $\mathcal{O}( (\bar d + \bar d^{13/21} \epsilon^{-2/7}) \ln \bar d)$ for nonconvex and convex optimization, respectively, improving best-known results for both setups.}
}

EndNote

%0 Conference Paper
%T Faster Newton Methods for Convex and Nonconvex Optimization in Gradient Complexity
%A Lesi Chen
%A Chengchang Liu
%A Luo Luo
%A Jingzhao Zhang
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore
%F pmlr-v336-chen26a
%I PMLR
%P 1088--1112
%U https://proceedings.mlr.press/v336/chen26a.html
%V 336
%X Second-order optimization methods are computationally expensive for large-scale problems. Recently, Doikov, Chayti, and Jaggi (ICML 2023) proposed the LazyCRN method that reduces computation by studying the gradient complexity of second-order methods. Their method can achieve a gradient complexity of $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-3/2})$ and $\mathcal{O}( \bar d + \bar d^{1/2} \epsilon^{-1/2})$ for nonconvex and convex optimization, respectively, where $\bar d$ is the effective dimension and $\epsilon$ is the target precision. Very recently, Adil, Bullins, Sidford, and Zhang (NeurIPS 2025) improved the gradient complexity to $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2} \ln^{18} \epsilon^{-1})$ for nonconvex optimization. However, the tightness of these methods remains open. In this work, we propose new methods that achieve an improved complexity of $\mathcal{O}( \bar d + \bar d^{1/3} \epsilon^{-3/2})$ and $\mathcal{O}( (\bar d + \bar d^{13/21} \epsilon^{-2/7}) \ln \bar d)$ for nonconvex and convex optimization, respectively, improving best-known results for both setups.

APA

Chen, L., Liu, C., Luo, L., & Zhang, J. (2026). Faster Newton Methods for Convex and Nonconvex Optimization in Gradient Complexity. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:1088-1112. Available from https://proceedings.mlr.press/v336/chen26a.html.
