Second-Order Optimization with Lazy Hessians

Nikita Doikov, El Mahdi Chayti, Martin Jaggi
Proceedings of the 40th International Conference on Machine Learning, PMLR 202:8138-8161, 2023.

Abstract

We analyze Newton’s method with lazy Hessian updates for solving general, possibly non-convex optimization problems. We propose to reuse a previously seen Hessian for several iterations while computing new gradients at each step of the method. This significantly reduces the overall arithmetic complexity of second-order optimization schemes. By using the cubic regularization technique, we establish fast global convergence of our method to a second-order stationary point, even though the Hessian does not need to be updated at every iteration. For convex problems, we justify global and local superlinear rates for lazy Newton steps with quadratic regularization, which is easier to compute. The optimal frequency for updating the Hessian is once every $d$ iterations, where $d$ is the dimension of the problem. This provably improves the total arithmetic complexity of second-order algorithms by a factor of $\sqrt{d}$.
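
The abstract summarizes the core idea: pay the full $O(d^3)$ cost of forming and factoring the Hessian only occasionally, while every iteration still uses a fresh gradient. Below is a minimal, hedged sketch of that idea in Python (not the authors' reference code): it refreshes and eigendecomposes the Hessian once every m iterations (m = d by default, the schedule the paper identifies as optimal) and reuses the cached factorization for cheap $O(d^2)$ regularized Newton steps in between. The function name, the simple gradient-based damping rule, and the default parameters are illustrative assumptions, not details taken from the paper.

    import numpy as np

    def lazy_regularized_newton(f_grad, f_hess, x0, hess_period=None,
                                reg_const=1.0, max_iters=200, tol=1e-8):
        """Newton-type steps that reuse a 'lazy' Hessian.

        f_grad(x) returns the gradient; f_hess(x) returns the d x d Hessian.
        The Hessian is recomputed (and eigendecomposed, O(d^3)) only every
        `hess_period` iterations; all intermediate steps reuse the cached
        eigendecomposition, so each of them costs only O(d^2).
        """
        x = np.asarray(x0, dtype=float)
        d = x.size
        m = hess_period or d          # lazy schedule: one Hessian per d iterations
        lam, Q = None, None           # cached eigendecomposition of the stored Hessian

        for k in range(max_iters):
            g = f_grad(x)
            if np.linalg.norm(g) <= tol:
                break
            if k % m == 0:            # expensive refresh: new Hessian + factorization
                H = f_hess(x)
                lam, Q = np.linalg.eigh(H)   # H = Q diag(lam) Q^T
            # Quadratic (gradient-norm) regularization; this particular damping
            # rule is an assumption in the spirit of regularized Newton methods.
            reg = np.sqrt(reg_const * np.linalg.norm(g))
            # Cheap lazy step: x <- x - (H_lazy + reg*I)^{-1} g via the eigenbasis;
            # np.maximum guards against non-positive-definite directions.
            step = Q @ (Q.T @ g / np.maximum(lam + reg, 1e-12))
            x = x - step
        return x

For instance, on a convex quadratic f(x) = 0.5 x^T A x - b^T x one could call lazy_regularized_newton(lambda x: A @ x - b, lambda x: A, x0); the (constant) Hessian is then factored only once and every subsequent step reduces to an O(d^2) solve in the cached eigenbasis.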

Cite this Paper


BibTeX
@InProceedings{pmlr-v202-doikov23a,
  title     = {Second-Order Optimization with Lazy Hessians},
  author    = {Doikov, Nikita and Chayti, El Mahdi and Jaggi, Martin},
  booktitle = {Proceedings of the 40th International Conference on Machine Learning},
  pages     = {8138--8161},
  year      = {2023},
  editor    = {Krause, Andreas and Brunskill, Emma and Cho, Kyunghyun and Engelhardt, Barbara and Sabato, Sivan and Scarlett, Jonathan},
  volume    = {202},
  series    = {Proceedings of Machine Learning Research},
  month     = {23--29 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v202/doikov23a/doikov23a.pdf},
  url       = {https://proceedings.mlr.press/v202/doikov23a.html},
  abstract  = {We analyze Newton’s method with lazy Hessian updates for solving general possibly non-convex optimization problems. We propose to reuse a previously seen Hessian for several iterations while computing new gradients at each step of the method. This significantly reduces the overall arithmetic complexity of second-order optimization schemes. By using the cubic regularization technique, we establish fast global convergence of our method to a second-order stationary point, while the Hessian does not need to be updated each iteration. For convex problems, we justify global and local superlinear rates for lazy Newton steps with quadratic regularization, which is easier to compute. The optimal frequency for updating the Hessian is once every $d$ iterations, where $d$ is the dimension of the problem. This provably improves the total arithmetic complexity of second-order algorithms by a factor $\sqrt{d}$.}
}
APA
Doikov, N., Chayti, E. M. & Jaggi, M. (2023). Second-Order Optimization with Lazy Hessians. Proceedings of the 40th International Conference on Machine Learning, in Proceedings of Machine Learning Research 202:8138-8161. Available from https://proceedings.mlr.press/v202/doikov23a.html.
