Data Dependent Regret Bounds for Online Portfolio Selection with Predicted Returns
Proceedings of The 36th International Conference on Algorithmic Learning Theory, PMLR 272:937-984, 2025.
Abstract
We study data-dependent regret bounds for the Online Portfolio Selection (OPS) problem. As opposed to worst-case bounds that hold uniformly over all sequences of returns, data-dependent bounds adapt to the specific sequence of returns seen by the investor. Consider a market of n assets and T time periods, with returns r_1, \ldots, r_T \in \mathbb{R}^n_+. The regret of our proposed algorithm, Log-Barrier Adaptive-Curvature Online Newton Step (LB-AdaCurv ONS), is bounded by O(\min\{nR \log T, \sqrt{nT \log T}\}), where R = \max_{t,i,j} \frac{r_t(i)}{r_t(j)} is a data-dependent quantity that is not known to the algorithm. Thus, LB-AdaCurv ONS has a worst-case regret of O(\sqrt{nT \log T}) while simultaneously having a data-dependent regret of O(nR \log T). Next, we consider the more practical setting of OPS with predicted returns, where the investor has access to predictions that can be incorporated into the portfolio selection process. We propose the Optimistic Expected Utility LB-FTRL (OUE-LB-FTRL) algorithm, which incorporates the predictions using a utility function. If the predictions are accurate, OUE-LB-FTRL's regret is O(n \log T); even if the predictions are arbitrary, its regret is always bounded by O(\sqrt{nT \log T}). We also provide a meta-algorithm, Best-of-Both Worlds for OPS (BoB-OPS), which combines the portfolios of an expected utility investor and a regret minimizing investor. By properly instantiating BoB-OPS, we show that the regret with respect to the expected utility investor is O(\log T) and the static regret is O(n \log T). Finally, we show new First-Order, Second-Order, and Gradual-Variation regret bounds for OPS. In our analysis, we develop new regret inequalities for optimistic FTRL with convex hint functions. This extends prior work on optimistic FTRL that used only linear hints, and so may be of independent interest.
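To make the problem setup and the data-dependent quantity R concrete, the following minimal sketch (not the paper's code; the return distribution, the uniform placeholder portfolio, and all variable names are assumptions for illustration) generates a hypothetical market of n assets over T rounds, computes R = \max_{t,i,j} \frac{r_t(i)}{r_t(j)} from the observed returns, and evaluates the log-wealth of a fixed portfolio, which is the quantity the regret bounds above compare against.

```python
import numpy as np

# Hedged illustration of the OPS setup, not LB-AdaCurv ONS or any algorithm
# from the paper: the uniform portfolio below is only a placeholder investor.

rng = np.random.default_rng(0)
n, T = 5, 1000                                  # n assets, T time periods
returns = rng.uniform(0.5, 1.5, size=(T, n))    # hypothetical gross returns r_t in R^n_+

# Data-dependent quantity R = max over t, i, j of r_t(i) / r_t(j),
# unknown to the algorithm in advance.
R = (returns.max(axis=1) / returns.min(axis=1)).max()

def log_wealth(x, returns):
    """Cumulative log-wealth of a fixed portfolio x on the simplex: sum_t log <x, r_t>."""
    return np.log(returns @ x).sum()

uniform = np.full(n, 1.0 / n)                   # placeholder constantly-rebalanced portfolio
print(f"R = {R:.2f}, log-wealth of uniform portfolio = {log_wealth(uniform, returns):.2f}")
```

The static regret of an OPS algorithm is the gap between the best fixed portfolio's log-wealth and the algorithm's log-wealth; the bounds in the abstract control this gap either in terms of R (data-dependent) or \sqrt{nT \log T} (worst-case).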