Quasi-Newton Steps for Efficient Online Exp-Concave Optimization

Zakaria Mhammedi, Khashayar Gatmiry
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:4473-4503, 2023.

Abstract

The aim of this paper is to design computationally-efficient and optimal algorithms for the online and stochastic exp-concave optimization settings. Typical algorithms for these settings, such as the Online Newton Step (ONS), can guarantee an $O(d\ln T)$ bound on their regret after $T$ rounds, where $d$ is the dimension of the feasible set. However, such algorithms perform so-called \emph{generalized projections} whenever their iterates step outside the feasible set. Such generalized projections require $\Omega(d^3)$ arithmetic operations even for simple sets such as a Euclidean ball, making the total runtime of ONS of order $d^3 T$ after $T$ rounds, in the worst case. In this paper, we side-step generalized projections by using a self-concordant barrier as a regularizer to compute the Newton steps. This ensures that the iterates are always within the feasible set without requiring projections. This approach still requires computing the inverse of the Hessian of the barrier at every step. However, using stability properties of the Newton iterates, we show that the inverses of these Hessians can be efficiently approximated via Taylor expansions for most rounds, resulting in an $\widetilde O(d^2 T +d^\omega \sqrt{T})$ total computational complexity, where $\omega\in(2,3]$ is the exponent of matrix multiplication. In the stochastic setting, we show that this translates into an $\widetilde O(d^3/\varepsilon)$ computational complexity for finding an $\varepsilon$-optimal point, answering an open question of Koren (2013). We first prove these new results for the simple case where the feasible set is a Euclidean ball. Then, to move to general convex sets, we use a reduction to Online Convex Optimization over the Euclidean ball. Our final algorithm for general convex sets can be viewed as a more computationally-efficient version of ONS.
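
To make the two ingredients described above concrete, the following is a minimal, hypothetical NumPy sketch for the Euclidean-ball case. It is not the paper's exact algorithm: the function names, the damping rule, and the truncation order are illustrative assumptions. The first routine takes a Newton-style step regularized by the log-barrier $\psi(x) = -\ln(R^2 - \|x\|^2)$, a self-concordant barrier for the ball, so the iterate stays strictly feasible without any generalized projection. The second shows how a previously computed Hessian inverse can be reused via a truncated Taylor (Neumann) series when the Hessian changes only slightly between rounds, using only $O(d^2)$ matrix-vector work per term.

```python
import numpy as np

# Illustrative sketch only: names, damping rule, and truncation order are
# assumptions, not the paper's exact algorithm.

def barrier_grad_hess(x, R):
    """Gradient and Hessian of psi(x) = -log(R^2 - ||x||^2), a self-concordant
    barrier for the Euclidean ball of radius R."""
    s = R ** 2 - x @ x                       # slack; positive strictly inside the ball
    g = 2.0 * x / s
    H = (2.0 / s) * np.eye(x.size) + (4.0 / s ** 2) * np.outer(x, x)
    return g, H

def barrier_newton_step(x, grad_loss, A, R, eta=1.0):
    """One barrier-regularized Newton-style step; the damping keeps the next
    iterate strictly inside the ball, so no generalized projection is needed.

    A is a running curvature matrix (e.g. a sum of gradient outer products,
    as maintained by ONS-style methods)."""
    g_psi, H_psi = barrier_grad_hess(x, R)
    H = A + eta * H_psi                      # combined curvature at x
    direction = np.linalg.solve(H, grad_loss + eta * g_psi)
    step = 1.0
    x_next = x - step * direction
    while np.linalg.norm(x_next) >= R:       # halve the step until strictly feasible
        step *= 0.5
        x_next = x - step * direction
    return x_next

def approx_solve(H_inv, Delta, v, order=2):
    """Approximate (H + Delta)^{-1} v by a truncated Neumann/Taylor series,
        (H + Delta)^{-1} = sum_{k >= 0} (-H_inv Delta)^k H_inv,
    using only matrix-vector products (O(d^2) work per term).  Accurate when
    consecutive Hessians are close, i.e. when ||H_inv Delta|| is small."""
    y = H_inv @ v                            # k = 0 term
    result = y.copy()
    for _ in range(order):
        y = -H_inv @ (Delta @ y)             # next term of the series
        result = result + y
    return result
```

Roughly speaking, since the abstract notes that the Taylor approximation is valid for most rounds, an exact inverse (or factorization) only has to be recomputed on the relatively few rounds where the iterate moves too far, which is what yields the $\widetilde O(d^2 T + d^\omega \sqrt{T})$ total cost stated above.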

Cite this Paper


BibTeX
@InProceedings{pmlr-v195-mhammedi23a,
  title     = {Quasi-Newton Steps for Efficient Online Exp-Concave Optimization},
  author    = {Mhammedi, Zakaria and Gatmiry, Khashayar},
  booktitle = {Proceedings of Thirty Sixth Conference on Learning Theory},
  pages     = {4473--4503},
  year      = {2023},
  editor    = {Neu, Gergely and Rosasco, Lorenzo},
  volume    = {195},
  series    = {Proceedings of Machine Learning Research},
  month     = {12--15 Jul},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v195/mhammedi23a/mhammedi23a.pdf},
  url       = {https://proceedings.mlr.press/v195/mhammedi23a.html},
  abstract  = {The aim of this paper is to design computationally-efficient and optimal algorithms for the online and stochastic exp-concave optimization settings. Typical algorithms for these settings, such as the Online Newton Step (ONS), can guarantee an $O(d\ln T)$ bound on their regret after $T$ rounds, where $d$ is the dimension of the feasible set. However, such algorithms perform so-called \emph{generalized projections} whenever their iterates step outside the feasible set. Such generalized projections require $\Omega(d^3)$ arithmetic operations even for simple sets such as a Euclidean ball, making the total runtime of ONS of order $d^3 T$ after $T$ rounds, in the worst case. In this paper, we side-step generalized projections by using a self-concordant barrier as a regularizer to compute the Newton steps. This ensures that the iterates are always within the feasible set without requiring projections. This approach still requires computing the inverse of the Hessian of the barrier at every step. However, using stability properties of the Newton iterates, we show that the inverses of these Hessians can be efficiently approximated via Taylor expansions for most rounds, resulting in an $\widetilde O(d^2 T +d^\omega \sqrt{T})$ total computational complexity, where $\omega\in(2,3]$ is the exponent of matrix multiplication. In the stochastic setting, we show that this translates into an $\widetilde O(d^3/\varepsilon)$ computational complexity for finding an $\varepsilon$-optimal point, answering an open question of Koren (2013). We first prove these new results for the simple case where the feasible set is a Euclidean ball. Then, to move to general convex sets, we use a reduction to Online Convex Optimization over the Euclidean ball. Our final algorithm for general convex sets can be viewed as a more computationally-efficient version of ONS.}
}
Endnote
%0 Conference Paper
%T Quasi-Newton Steps for Efficient Online Exp-Concave Optimization
%A Zakaria Mhammedi
%A Khashayar Gatmiry
%B Proceedings of Thirty Sixth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2023
%E Gergely Neu
%E Lorenzo Rosasco
%F pmlr-v195-mhammedi23a
%I PMLR
%P 4473--4503
%U https://proceedings.mlr.press/v195/mhammedi23a.html
%V 195
%X The aim of this paper is to design computationally-efficient and optimal algorithms for the online and stochastic exp-concave optimization settings. Typical algorithms for these settings, such as the Online Newton Step (ONS), can guarantee an $O(d\ln T)$ bound on their regret after $T$ rounds, where $d$ is the dimension of the feasible set. However, such algorithms perform so-called \emph{generalized projections} whenever their iterates step outside the feasible set. Such generalized projections require $\Omega(d^3)$ arithmetic operations even for simple sets such as a Euclidean ball, making the total runtime of ONS of order $d^3 T$ after $T$ rounds, in the worst case. In this paper, we side-step generalized projections by using a self-concordant barrier as a regularizer to compute the Newton steps. This ensures that the iterates are always within the feasible set without requiring projections. This approach still requires computing the inverse of the Hessian of the barrier at every step. However, using stability properties of the Newton iterates, we show that the inverses of these Hessians can be efficiently approximated via Taylor expansions for most rounds, resulting in an $\widetilde O(d^2 T +d^\omega \sqrt{T})$ total computational complexity, where $\omega\in(2,3]$ is the exponent of matrix multiplication. In the stochastic setting, we show that this translates into an $\widetilde O(d^3/\varepsilon)$ computational complexity for finding an $\varepsilon$-optimal point, answering an open question of Koren (2013). We first prove these new results for the simple case where the feasible set is a Euclidean ball. Then, to move to general convex sets, we use a reduction to Online Convex Optimization over the Euclidean ball. Our final algorithm for general convex sets can be viewed as a more computationally-efficient version of ONS.
APA
Mhammedi, Z. & Gatmiry, K. (2023). Quasi-Newton Steps for Efficient Online Exp-Concave Optimization. Proceedings of Thirty Sixth Conference on Learning Theory, in Proceedings of Machine Learning Research 195:4473-4503. Available from https://proceedings.mlr.press/v195/mhammedi23a.html.
