Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate

Ruichen Jiang; Parameswaran Raman; Shoham Sabach; Aryan Mokhtari; Mingyi Hong; Volkan Cevher

Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate

Ruichen Jiang, Parameswaran Raman, Shoham Sabach, Aryan Mokhtari, Mingyi Hong, Volkan Cevher

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4411-4419, 2024.

Abstract

Second-order optimization methods, such as cubic regularized Newton methods, are known for their rapid convergence rates; nevertheless, they become impractical in high-dimensional problems due to their substantial memory requirements and computational costs. One promising approach is to execute second-order updates within a lower-dimensional subspace, giving rise to subspace second-order methods. However, the majority of existing subspace second-order methods randomly select subspaces, consequently resulting in slower convergence rates depending on the problem’s dimension

$d$ . In this paper, we introduce a novel subspace cubic regularized Newton method that achieves a dimension-independent global convergence rate of

$\mathcal{O}\left(\frac{1}{mk}+\frac{1}{k^2}\right)$ for solving convex optimization problems. Here,

$m$ represents the subspace dimension, which can be significantly smaller than

$d$ . Instead of adopting a random subspace, our primary innovation involves performing the cubic regularized Newton update within the \emph{Krylov subspace} associated with the Hessian and the gradient of the objective function. This result marks the first instance of a dimension-independent convergence rate for a subspace second-order method. Furthermore, when specific spectral conditions of the Hessian are met, our method recovers the convergence rate of a full-dimensional cubic regularized Newton method. Numerical experiments show our method converges faster than existing random subspace methods, especially for high-dimensional problems.

Cite this Paper

BibTeX

@InProceedings{pmlr-v238-jiang24a,
  title = 	 {Krylov Cubic Regularized {N}ewton: A Subspace Second-Order Method with Dimension-Free Convergence Rate},
  author =       {Jiang, Ruichen and Raman, Parameswaran and Sabach, Shoham and Mokhtari, Aryan and Hong, Mingyi and Cevher, Volkan},
  booktitle = 	 {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {4411--4419},
  year = 	 {2024},
  editor = 	 {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = 	 {238},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {02--04 May},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v238/jiang24a/jiang24a.pdf},
  url = 	 {https://proceedings.mlr.press/v238/jiang24a.html},
  abstract = 	 {Second-order optimization methods, such as cubic regularized Newton methods, are known for their rapid convergence rates; nevertheless, they become impractical in high-dimensional problems due to their substantial memory requirements and computational costs. One promising approach is to execute second-order updates within a lower-dimensional subspace, giving rise to subspace second-order methods. However, the majority of existing subspace second-order methods randomly select subspaces, consequently resulting in slower convergence rates depending on the problem’s dimension $d$. In this paper, we introduce a novel subspace cubic regularized Newton method that achieves a dimension-independent global convergence rate of $\mathcal{O}\left(\frac{1}{mk}+\frac{1}{k^2}\right)$ for solving convex optimization problems. Here, $m$ represents the subspace dimension, which can be significantly smaller than $d$. Instead of adopting a random subspace, our primary innovation involves performing the cubic regularized Newton update within the \emph{Krylov subspace} associated with the Hessian and the gradient of the objective function. This result marks the first instance of a dimension-independent convergence rate for a subspace second-order method. Furthermore, when specific spectral conditions of the Hessian are met, our method recovers the convergence rate of a full-dimensional cubic regularized Newton method. Numerical experiments show our method converges faster than existing random subspace methods, especially for high-dimensional problems.}
}

Endnote

%0 Conference Paper
%T Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate
%A Ruichen Jiang
%A Parameswaran Raman
%A Shoham Sabach
%A Aryan Mokhtari
%A Mingyi Hong
%A Volkan Cevher
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li	
%F pmlr-v238-jiang24a
%I PMLR
%P 4411--4419
%U https://proceedings.mlr.press/v238/jiang24a.html
%V 238
%X Second-order optimization methods, such as cubic regularized Newton methods, are known for their rapid convergence rates; nevertheless, they become impractical in high-dimensional problems due to their substantial memory requirements and computational costs. One promising approach is to execute second-order updates within a lower-dimensional subspace, giving rise to subspace second-order methods. However, the majority of existing subspace second-order methods randomly select subspaces, consequently resulting in slower convergence rates depending on the problem’s dimension $d$. In this paper, we introduce a novel subspace cubic regularized Newton method that achieves a dimension-independent global convergence rate of $\mathcal{O}\left(\frac{1}{mk}+\frac{1}{k^2}\right)$ for solving convex optimization problems. Here, $m$ represents the subspace dimension, which can be significantly smaller than $d$. Instead of adopting a random subspace, our primary innovation involves performing the cubic regularized Newton update within the \emph{Krylov subspace} associated with the Hessian and the gradient of the objective function. This result marks the first instance of a dimension-independent convergence rate for a subspace second-order method. Furthermore, when specific spectral conditions of the Hessian are met, our method recovers the convergence rate of a full-dimensional cubic regularized Newton method. Numerical experiments show our method converges faster than existing random subspace methods, especially for high-dimensional problems.

APA

Jiang, R., Raman, P., Sabach, S., Mokhtari, A., Hong, M. & Cevher, V.. (2024). Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4411-4419 Available from https://proceedings.mlr.press/v238/jiang24a.html.

Krylov Cubic Regularized Newton: A Subspace Second-Order Method with Dimension-Free Convergence Rate

Abstract

Cite this Paper

Related Material