Online Learning in Dynamically Changing Environments
Proceedings of Thirty Sixth Conference on Learning Theory, PMLR 195:325-358, 2023.
Abstract
We study the problem of online learning and online regret minimization when samples are drawn from a general unknown \emph{non-stationary} process. We introduce the concept of a \emph{dynamically changing process} with cost $K$, where the \emph{conditional} marginals of the process can vary arbitrarily, but the number of distinct conditional marginals is bounded by $K$ over $T$ rounds. For such processes we prove a tight (up to a $\sqrt{\log T}$ factor) bound $O(\sqrt{KT\cdot\mathrm{vc}(\mathcal{H})\log T})$ on the \emph{expected worst-case} regret of any class $\mathcal{H}$ of finite VC dimension under absolute loss (i.e., the expected misclassification loss). We then improve this bound for general mixable losses by establishing a tight (up to a $\log^3 T$ factor) regret bound $O(K\cdot\mathrm{vc}(\mathcal{H})\log^3 T)$. We extend these results to general \emph{smooth adversary} processes with \emph{unknown} reference measure by showing a sub-linear regret bound for 1-dimensional threshold functions under a general bounded convex loss. Our results can be viewed as a first step towards regret analysis with non-stationary samples in the \emph{distribution-blind} (universal) regime. This also brings a new viewpoint that shifts the study of the complexity of hypothesis classes to the study of the complexity of the processes generating the data.
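The abstract does not specify an algorithm, but the setting it describes is easy to illustrate. The toy sketch below (all names and parameter choices are illustrative assumptions, not the paper's construction) simulates a dynamically changing process whose conditional marginal $P(y=1)$ switches at most $K$ times over $T$ rounds, and runs exponential weights over a small finite class of constant predictors under absolute loss, measuring the regret against the best fixed predictor in hindsight:

```python
import math
import random


def simulate(T=2000, K=4, n_experts=8, seed=0):
    """Toy illustration (not the paper's algorithm): a K-piece
    non-stationary binary process and an exponential-weights learner
    over constant predictors, evaluated under absolute loss."""
    rng = random.Random(seed)
    # Finite "hypothesis class": constant predictions in [0, 1].
    experts = [i / (n_experts - 1) for i in range(n_experts)]
    # Standard tuned learning rate for Hedge with N experts over T rounds.
    eta = math.sqrt(8.0 * math.log(n_experts) / T)
    weights = [1.0] * n_experts
    expert_loss = [0.0] * n_experts
    learner_loss = 0.0

    # The conditional marginal P(y=1) is piecewise constant with K pieces,
    # i.e., at most K distinct conditional marginals over the T rounds.
    biases = [rng.random() for _ in range(K)]
    seg = T // K
    for t in range(T):
        p1 = biases[min(t // seg, K - 1)]
        y = 1 if rng.random() < p1 else 0

        # Predict with the weighted average of the experts (absolute loss
        # is convex, so this enjoys the usual Hedge regret guarantee).
        total = sum(weights)
        pred = sum(w * e for w, e in zip(weights, experts)) / total
        learner_loss += abs(pred - y)

        for i, e in enumerate(experts):
            loss = abs(e - y)
            expert_loss[i] += loss
            weights[i] *= math.exp(-eta * loss)

    # Regret against the best fixed expert in hindsight.
    return learner_loss - min(expert_loss)
```

Note that this only measures regret against a fixed comparator; the point of the paper's bounds is that, for a dynamically changing process, meaningful guarantees scale with the number of switches $K$ rather than degrading linearly in $T$.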