A simple, optimal and efficient algorithm for online exp-concave optimization

Yi-Han Wang; Peng Zhao; Zhi-Hua Zhou

Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:6651-6691, 2026.

Abstract

Online eXp-concave Optimization (OXO) is a fundamental problem in online learning, where the goal is to minimize regret when loss functions are exponentially concave. The standard algorithm, Online Newton Step (ONS), guarantees an optimal $O(d \log T)$ regret, where $d$ is the dimension and $T$ is the time horizon. Despite its simplicity, ONS may face a computational bottleneck due to the \emph{Mahalanobis projection} at each round. This step costs $\Omega(d^\omega)$ arithmetic operations for bounded domains, even for simple domains such as the unit ball, where $\omega \in (2,3]$ is the matrix-multiplication exponent. As a result, the total runtime can reach $\tilde{O}(d^\omega T)$, particularly when iterates frequently oscillate near the domain boundary. This paper proposes a simple variant of ONS, called LightONS, which reduces the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$ while preserving the optimal regret. Deploying LightONS with the online-to-batch conversion implies a method for stochastic exp-concave optimization with runtime $\tilde{O}(d^3/\varepsilon)$, thereby answering an open problem posed by Koren [2013]. The design leverages domain-conversion techniques from parameter-free online learning and defers expensive Mahalanobis projections until necessary, thereby preserving the elegant structure of ONS and enabling LightONS to act as an efficient plug-in replacement in broader scenarios, including gradient-norm adaptivity, parametric stochastic bandits, and memory-efficient OXO.
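
To make the bottleneck described in the abstract concrete, the following is a minimal Python sketch of one round of the standard ONS update on the unit ball. It is not LightONS and is not taken from the paper. The function names `mahalanobis_project_unit_ball` and `ons_step`, the step parameter `gamma`, and the eigendecomposition-plus-bisection projection routine are all assumptions made for illustration. The rank-one update and the Newton-style step cost $O(d^2)$ here, while the projection calls a dense eigendecomposition, which is cubic in $d$ in NumPy rather than $d^\omega$.

```python
import numpy as np


def mahalanobis_project_unit_ball(y, A, tol=1e-12, max_iter=200):
    """Return argmin over ||x|| <= 1 of (x - y)^T A (x - y), A symmetric positive definite.

    Illustrative routine (an assumption, not the paper's): eigendecomposition plus bisection.
    """
    if np.linalg.norm(y) <= 1.0:
        return y.copy()  # already feasible, no projection needed
    # Eigendecomposition A = Q diag(lam) Q^T: the expensive step of this sketch.
    lam, Q = np.linalg.eigh(A)
    z = Q.T @ y

    # KKT conditions give x(mu) = Q diag(lam / (lam + mu)) Q^T y for some mu >= 0
    # with ||x(mu)|| = 1; the norm is decreasing in mu, so bisection applies.
    def norm_at(mu):
        return np.linalg.norm(lam * z / (lam + mu))

    lo, hi = 0.0, 1.0
    while norm_at(hi) > 1.0:  # grow the bracket until it contains the root
        hi *= 2.0
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if norm_at(mid) > 1.0:
            lo = mid
        else:
            hi = mid
        if hi - lo <= tol * max(1.0, hi):
            break
    # Use the upper end of the bracket so the returned point is feasible.
    return Q @ (lam * z / (lam + hi))


def ons_step(x, A, A_inv, grad, gamma):
    """One ONS round: rank-one update, Newton-style step, Mahalanobis projection."""
    A = A + np.outer(grad, grad)
    # Sherman-Morrison keeps the inverse up to date in O(d^2).
    Ag = A_inv @ grad
    A_inv = A_inv - np.outer(Ag, Ag) / (1.0 + grad @ Ag)
    y = x - (A_inv @ grad) / gamma
    return mahalanobis_project_unit_ball(y, A), A, A_inv
```

In this sketch the projection returns immediately whenever the unprojected iterate already lies inside the ball, so the cubic cost is paid only in rounds where the iterate leaves the domain. This matches the abstract's remark that the worst case arises when iterates oscillate near the boundary.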

Cite this Paper

BibTeX

@InProceedings{pmlr-v336-wang26b,
  title =     {A simple, optimal and efficient algorithm for online exp-concave optimization},
  author =    {Wang, Yi-Han and Zhao, Peng and Zhou, Zhi-Hua},
  booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages =     {6651--6691},
  year =      {2026},
  editor =    {Hanneke, Steve and Lattimore, Tor},
  volume =    {336},
  series =    {Proceedings of Machine Learning Research},
  month =     {29 Jun--03 Jul},
  publisher = {PMLR},
  pdf =       {https://raw.githubusercontent.com/mlresearch/v336/main/assets/wang26b/wang26b.pdf},
  url =       {https://proceedings.mlr.press/v336/wang26b.html},
  abstract = 	 { Online eXp-concave Optimization (OXO) is a fundamental problem in online learning, where the goal is to minimize regret when loss functions are exponentially concave. The standard algorithm, Online Newton Step (ONS), guarantees an optimal $O(d \log T)$ regret, where $d$ is the dimension and $T$ is the time horizon. Despite its simplicity, ONS may face a computational bottleneck due to the \emph{Mahalanobis projection} at each round. This step costs $\Omega(d^\omega)$ arithmetic operations for bounded domains, even for simple domains such as the unit ball, where $\omega \in (2,3]$ is the matrix-multiplication exponent. As a result, the total runtime can reach $\tilde{O}(d^\omega T)$, particularly when iterates frequently oscillate near the domain boundary. This paper proposes a simple variant of ONS, called LightONS, which reduces the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$ while preserving the optimal regret. Deploying LightONS with the online-to-batch conversion implies a method for stochastic exp-concave optimization with runtime $\tilde{O}(d^3/\varepsilon)$, thereby answering an open problem posed by Koren [2013]. The design leverages domain-conversion techniques from parameter-free online learning and defers expensive Mahalanobis projections until necessary, thereby preserving the elegant structure of ONS and enabling LightONS to act as an efficient plug-in replacement in broader scenarios, including gradient-norm adaptivity, parametric stochastic bandits, and memory-efficient OXO. }
}

Endnote

%0 Conference Paper
%T A simple, optimal and efficient algorithm for online exp-concave optimization
%A Yi-Han Wang
%A Peng Zhao
%A Zhi-Hua Zhou
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore
%F pmlr-v336-wang26b
%I PMLR
%P 6651--6691
%U https://proceedings.mlr.press/v336/wang26b.html
%V 336
%X  Online eXp-concave Optimization (OXO) is a fundamental problem in online learning, where the goal is to minimize regret when loss functions are exponentially concave. The standard algorithm, Online Newton Step (ONS), guarantees an optimal $O(d \log T)$ regret, where $d$ is the dimension and $T$ is the time horizon. Despite its simplicity, ONS may face a computational bottleneck due to the \emph{Mahalanobis projection} at each round. This step costs $\Omega(d^\omega)$ arithmetic operations for bounded domains, even for simple domains such as the unit ball, where $\omega \in (2,3]$ is the matrix-multiplication exponent. As a result, the total runtime can reach $\tilde{O}(d^\omega T)$, particularly when iterates frequently oscillate near the domain boundary. This paper proposes a simple variant of ONS, called LightONS, which reduces the total runtime to $O(d^2 T + d^\omega \sqrt{T \log T})$ while preserving the optimal regret. Deploying LightONS with the online-to-batch conversion implies a method for stochastic exp-concave optimization with runtime $\tilde{O}(d^3/\varepsilon)$, thereby answering an open problem posed by Koren [2013]. The design leverages domain-conversion techniques from parameter-free online learning and defers expensive Mahalanobis projections until necessary, thereby preserving the elegant structure of ONS and enabling LightONS to act as an efficient plug-in replacement in broader scenarios, including gradient-norm adaptivity, parametric stochastic bandits, and memory-efficient OXO.

APA

Wang, Y., Zhao, P. & Zhou, Z. (2026). A simple, optimal and efficient algorithm for online exp-concave optimization. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:6651-6691. Available from https://proceedings.mlr.press/v336/wang26b.html.
