Sharpened Lazy Incremental Quasi-Newton Method

Aakash Sunil Lahoti; Spandan Senapati; Ketan Rajawat; Alec Koppel

Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, PMLR 238:4735-4743, 2024.

Abstract

The problem of minimizing the sum of $n$ functions in $d$ dimensions is ubiquitous in machine learning and statistics. In many applications where the number of observations $n$ is large, it is necessary to use incremental or stochastic methods, as their per-iteration cost is independent of $n$. Of these, Quasi-Newton (QN) methods strike a balance between the per-iteration cost and the convergence rate. Specifically, they exhibit a superlinear rate with $O(d^2)$ cost in contrast to the linear rate of first-order methods with $O(d)$ cost and the quadratic rate of second-order methods with $O(d^3)$ cost. However, existing incremental methods have notable shortcomings: Incremental Quasi-Newton (IQN) only exhibits asymptotic superlinear convergence. In contrast, Incremental Greedy BFGS (IGS) offers explicit superlinear convergence but suffers from poor empirical performance and has a per-iteration cost of $O(d^3)$. To address these issues, we introduce the Sharpened Lazy Incremental Quasi-Newton Method (SLIQN) that achieves the best of both worlds: an explicit superlinear convergence rate, and superior empirical performance at a per-iteration $O(d^2)$ cost. SLIQN features two key changes: first, it incorporates a hybrid strategy of using both classic and greedy BFGS updates, allowing it to empirically outperform both IQN and IGS. Second, it employs a clever constant multiplicative factor along with a lazy propagation strategy, which enables it to have a cost of $O(d^2)$. Additionally, our experiments demonstrate the superiority of SLIQN over other incremental and stochastic Quasi-Newton variants and establish its competitiveness with second-order incremental methods.
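The abstract names BFGS and greedy BFGS updates without giving formulas. As a rough illustration only, the sketch below assumes the standard BFGS update of a Hessian approximation `B` toward a target matrix `A` along a direction `u`, together with the greedy rule from the greedy quasi-Newton literature that picks the coordinate direction maximizing `B_ii / A_ii`. It is not the SLIQN algorithm itself: the function names, the toy matrix `A`, and the initialization `B = L * I` are assumptions made for this example, and the hybrid strategy, multiplicative factor, and lazy propagation described above are not shown.

```python
import numpy as np

def bfgs_update(B, A, u):
    """Standard BFGS update of B toward A along u (assumed form).

    Uses only matrix-vector and outer products, so it costs O(d^2).
    """
    Bu = B @ u
    Au = A @ u
    return B - np.outer(Bu, Bu) / (u @ Bu) + np.outer(Au, Au) / (u @ Au)

def greedy_direction(B, A):
    """Pick the basis vector e_i that maximizes B_ii / A_ii (assumed greedy rule)."""
    i = int(np.argmax(np.diag(B) / np.diag(A)))
    e = np.zeros(B.shape[0])
    e[i] = 1.0
    return e

# Toy setup: A stands in for a true Hessian, B for its approximation.
rng = np.random.default_rng(0)
d = 5
M = rng.standard_normal((d, d))
A = M @ M.T + d * np.eye(d)          # symmetric positive definite
L = np.linalg.eigvalsh(A).max()      # largest eigenvalue of A
B = L * np.eye(d)                    # initial approximation with B >= A

u = greedy_direction(B, A)
B_new = bfgs_update(B, A, u)

# After the update, B_new agrees with A along the chosen direction.
print(np.allclose(B_new @ u, A @ u))
```

In this toy setup the updated matrix matches `A` along `u` and stays symmetric; a classic (non-greedy) update would instead take `u` from the most recent iterate difference.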

Cite this Paper

BibTeX

@InProceedings{pmlr-v238-sunil-lahoti24a,
  title = 	 {Sharpened Lazy Incremental Quasi-{N}ewton Method},
  author =       {Sunil Lahoti, Aakash and Senapati, Spandan and Rajawat, Ketan and Koppel, Alec},
  booktitle = 	 {Proceedings of The 27th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {4735--4743},
  year = 	 {2024},
  editor = 	 {Dasgupta, Sanjoy and Mandt, Stephan and Li, Yingzhen},
  volume = 	 {238},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {02--04 May},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v238/sunil-lahoti24a/sunil-lahoti24a.pdf},
  url = 	 {https://proceedings.mlr.press/v238/sunil-lahoti24a.html},
  abstract = 	 {The problem of minimizing the sum of $n$ functions in $d$ dimensions is ubiquitous in machine learning and statistics. In many applications where the number of observations $n$ is large, it is necessary to use incremental or stochastic methods, as their per-iteration cost is independent of $n$. Of these, Quasi-Newton (QN) methods strike a balance between the per-iteration cost and the convergence rate. Specifically, they exhibit a superlinear rate with $O(d^2)$ cost in contrast to the linear rate of first-order methods with $O(d)$ cost and the quadratic rate of second-order methods with $O(d^3)$ cost. However, existing incremental methods have notable shortcomings: Incremental Quasi-Newton (IQN) only exhibits asymptotic superlinear convergence. In contrast, Incremental Greedy BFGS (IGS) offers explicit superlinear convergence but suffers from poor empirical performance and has a per-iteration cost of $O(d^3)$. To address these issues, we introduce the Sharpened Lazy Incremental Quasi-Newton Method (SLIQN) that achieves the best of both worlds: an explicit superlinear convergence rate, and superior empirical performance at a per-iteration $O(d^2)$ cost. SLIQN features two key changes: first, it incorporates a hybrid strategy of using both classic and greedy BFGS updates, allowing it to empirically outperform both IQN and IGS. Second, it employs a clever constant multiplicative factor along with a lazy propagation strategy, which enables it to have a cost of $O(d^2)$. Additionally, our experiments demonstrate the superiority of SLIQN over other incremental and stochastic Quasi-Newton variants and establish its competitiveness with second-order incremental methods.}
}

Endnote

%0 Conference Paper
%T Sharpened Lazy Incremental Quasi-Newton Method
%A Aakash Sunil Lahoti
%A Spandan Senapati
%A Ketan Rajawat
%A Alec Koppel
%B Proceedings of The 27th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2024
%E Sanjoy Dasgupta
%E Stephan Mandt
%E Yingzhen Li
%F pmlr-v238-sunil-lahoti24a
%I PMLR
%P 4735--4743
%U https://proceedings.mlr.press/v238/sunil-lahoti24a.html
%V 238
%X The problem of minimizing the sum of $n$ functions in $d$ dimensions is ubiquitous in machine learning and statistics. In many applications where the number of observations $n$ is large, it is necessary to use incremental or stochastic methods, as their per-iteration cost is independent of $n$. Of these, Quasi-Newton (QN) methods strike a balance between the per-iteration cost and the convergence rate. Specifically, they exhibit a superlinear rate with $O(d^2)$ cost in contrast to the linear rate of first-order methods with $O(d)$ cost and the quadratic rate of second-order methods with $O(d^3)$ cost. However, existing incremental methods have notable shortcomings: Incremental Quasi-Newton (IQN) only exhibits asymptotic superlinear convergence. In contrast, Incremental Greedy BFGS (IGS) offers explicit superlinear convergence but suffers from poor empirical performance and has a per-iteration cost of $O(d^3)$. To address these issues, we introduce the Sharpened Lazy Incremental Quasi-Newton Method (SLIQN) that achieves the best of both worlds: an explicit superlinear convergence rate, and superior empirical performance at a per-iteration $O(d^2)$ cost. SLIQN features two key changes: first, it incorporates a hybrid strategy of using both classic and greedy BFGS updates, allowing it to empirically outperform both IQN and IGS. Second, it employs a clever constant multiplicative factor along with a lazy propagation strategy, which enables it to have a cost of $O(d^2)$. Additionally, our experiments demonstrate the superiority of SLIQN over other incremental and stochastic Quasi-Newton variants and establish its competitiveness with second-order incremental methods.

APA

Sunil Lahoti, A., Senapati, S., Rajawat, K. & Koppel, A. (2024). Sharpened Lazy Incremental Quasi-Newton Method. Proceedings of The 27th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 238:4735-4743. Available from https://proceedings.mlr.press/v238/sunil-lahoti24a.html.
