Data Dependent Regret Bounds for Online Portfolio Selection with Predicted Returns
Proceedings of The 36th International Conference on Algorithmic Learning Theory, PMLR 272:937-984, 2025.
Abstract
We study data-dependent regret bounds for the Online Portfolio Selection (OPS) problem. As opposed to worst-case bounds that hold uniformly over all sequences of returns, data-dependent bounds adapt to the specific sequence of returns seen by the investor. Consider a market of n assets and T time periods, with returns r_1, \ldots, r_T \in \mathbb{R}^n_+. The regret of our proposed algorithm, Log-Barrier Adaptive-Curvature Online Newton Step (LB-AdaCurv ONS), is bounded by O(\min\{nR \log T, \sqrt{nT \log T}\}), where R = \max_{t,i,j} \frac{r_t(i)}{r_t(j)} is a data-dependent quantity that is not known to the algorithm. Thus, LB-AdaCurv ONS has a worst-case regret of O(\sqrt{nT \log T}) while simultaneously having a data-dependent regret of O(nR \log T). Next, we consider the more practical setting of OPS with predicted returns, where the investor has access to predictions that can be incorporated into the portfolio selection process. We propose the Optimistic Expected Utility LB-FTRL (OUE-LB-FTRL) algorithm, which incorporates the predictions using a utility function. If the predictions are accurate, OUE-LB-FTRL's regret is O(n \log T); even if the predictions are arbitrary, its regret is always bounded by O(\sqrt{nT \log T}). We also provide a meta-algorithm, Best-of-Both Worlds for OPS (BoB-OPS), which combines the portfolios of an expected utility investor and a regret minimizing investor. By properly instantiating BoB-OPS, we show that the regret with respect to the expected utility investor is O(\log T) and the static regret is O(n \log T). Finally, we show new First-Order, Second-Order, and Gradual-Variation regret bounds for OPS. In our analysis, we develop new regret inequalities for optimistic FTRL with convex hint functions. This extends prior work on optimistic FTRL that used only linear hints, and so may be of independent interest.
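To make the problem setup and the data-dependent quantity R concrete, the following minimal sketch (not the paper's code; the return distribution, the uniform placeholder portfolio, and all variable names are assumptions for illustration) generates a hypothetical market of n assets over T rounds, computes R = \max_{t,i,j} \frac{r_t(i)}{r_t(j)} from the observed returns, and evaluates the log-wealth of a fixed portfolio, which is the quantity the regret bounds above compare against.

```python
import numpy as np

# Hedged illustration of the OPS setup, not LB-AdaCurv ONS or any algorithm
# from the paper: the uniform portfolio below is only a placeholder investor.

rng = np.random.default_rng(0)
n, T = 5, 1000                                  # n assets, T time periods
returns = rng.uniform(0.5, 1.5, size=(T, n))    # hypothetical gross returns r_t in R^n_+

# Data-dependent quantity R = max over t, i, j of r_t(i) / r_t(j),
# unknown to the algorithm in advance.
R = (returns.max(axis=1) / returns.min(axis=1)).max()

def log_wealth(x, returns):
    """Cumulative log-wealth of a fixed portfolio x on the simplex: sum_t log <x, r_t>."""
    return np.log(returns @ x).sum()

uniform = np.full(n, 1.0 / n)                   # placeholder constantly-rebalanced portfolio
print(f"R = {R:.2f}, log-wealth of uniform portfolio = {log_wealth(uniform, returns):.2f}")
```

The static regret of an OPS algorithm is the gap between the best fixed portfolio's log-wealth and the algorithm's log-wealth; the bounds in the abstract control this gap either in terms of R (data-dependent) or \sqrt{nT \log T} (worst-case).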