Cubic regularized subspace Newton for non-convex optimization

Jim Zhao; Nikita Doikov; Aurelien Lucchi

Cubic regularized subspace Newton for non-convex optimization

Jim Zhao, Nikita Doikov, Aurelien Lucchi

Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:811-819, 2025.

Abstract

This paper addresses the optimization problem of minimizing non-convex continuous functions, a problem highly relevant in high-dimensional machine learning scenarios, particularly those involving over-parameterization. We analyze a randomized coordinate second-order method named SSCN, which can be interpreted as applying the cubic regularization of Newton’s method in random subspaces. This approach effectively reduces the computational complexity associated with utilizing second-order information, making it applicable in higher-dimensional scenarios. Theoretically, we establish strong global convergence guarantees for non-convex functions to a stationary point, with interpolating rates for arbitrary subspace sizes and allowing inexact curvature estimation, starting from an arbitrary initialization. When increasing the subspace size, our complexity matches the $\mathcal{O}(\epsilon^{-3/2})$ rate of the full Newton’s method with cubic regularization. Additionally, we propose an adaptive sampling scheme ensuring the exact convergence rate of $\mathcal{O}(\epsilon^{-3/2}, \epsilon^{-3})$ to a second-order stationary point, without requiring to sample all coordinates. Experimental results demonstrate substantial speed-ups achieved by SSCN compared to conventional first-order methods and other second-order subspace methods.

Cite this Paper

BibTeX

@InProceedings{pmlr-v258-zhao25a,
  title = 	 {Cubic regularized subspace Newton for non-convex optimization},
  author =       {Zhao, Jim and Doikov, Nikita and Lucchi, Aurelien},
  booktitle = 	 {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {811--819},
  year = 	 {2025},
  editor = 	 {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = 	 {258},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {03--05 May},
  publisher =    {PMLR},
  pdf = 	 {https://raw.githubusercontent.com/mlresearch/v258/main/assets/zhao25a/zhao25a.pdf},
  url = 	 {https://proceedings.mlr.press/v258/zhao25a.html},
  abstract = 	 {This paper addresses the optimization problem of minimizing non-convex continuous functions, a problem highly relevant in high-dimensional machine learning scenarios, particularly those involving over-parameterization. We analyze a randomized coordinate second-order method named SSCN, which can be interpreted as applying the cubic regularization of Newton’s method in random subspaces. This approach effectively reduces the computational complexity associated with utilizing second-order information, making it applicable in higher-dimensional scenarios. Theoretically, we establish strong global convergence guarantees for non-convex functions to a stationary point, with interpolating rates for arbitrary subspace sizes and allowing inexact curvature estimation, starting from an arbitrary initialization. When increasing the subspace size, our complexity matches the $\mathcal{O}(\epsilon^{-3/2})$  rate of the full Newton’s method with cubic regularization. Additionally, we propose an adaptive sampling scheme ensuring the exact convergence rate of $\mathcal{O}(\epsilon^{-3/2}, \epsilon^{-3})$ to a second-order stationary point, without requiring to sample all coordinates. Experimental results demonstrate substantial speed-ups achieved by SSCN  compared to conventional first-order methods and other second-order subspace methods.}
}

Endnote

%0 Conference Paper
%T Cubic regularized subspace Newton for non-convex optimization
%A Jim Zhao
%A Nikita Doikov
%A Aurelien Lucchi
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan	
%F pmlr-v258-zhao25a
%I PMLR
%P 811--819
%U https://proceedings.mlr.press/v258/zhao25a.html
%V 258
%X This paper addresses the optimization problem of minimizing non-convex continuous functions, a problem highly relevant in high-dimensional machine learning scenarios, particularly those involving over-parameterization. We analyze a randomized coordinate second-order method named SSCN, which can be interpreted as applying the cubic regularization of Newton’s method in random subspaces. This approach effectively reduces the computational complexity associated with utilizing second-order information, making it applicable in higher-dimensional scenarios. Theoretically, we establish strong global convergence guarantees for non-convex functions to a stationary point, with interpolating rates for arbitrary subspace sizes and allowing inexact curvature estimation, starting from an arbitrary initialization. When increasing the subspace size, our complexity matches the $\mathcal{O}(\epsilon^{-3/2})$  rate of the full Newton’s method with cubic regularization. Additionally, we propose an adaptive sampling scheme ensuring the exact convergence rate of $\mathcal{O}(\epsilon^{-3/2}, \epsilon^{-3})$ to a second-order stationary point, without requiring to sample all coordinates. Experimental results demonstrate substantial speed-ups achieved by SSCN  compared to conventional first-order methods and other second-order subspace methods.

APA

Zhao, J., Doikov, N. & Lucchi, A.. (2025). Cubic regularized subspace Newton for non-convex optimization. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:811-819 Available from https://proceedings.mlr.press/v258/zhao25a.html.

Cubic regularized subspace Newton for non-convex optimization

Abstract

Cite this Paper

Related Material