[edit]
Pseudonorm Approachability and Applications to Regret Minimization
Proceedings of The 34th International Conference on Algorithmic Learning Theory, PMLR 201:471-509, 2023.
Abstract
Blackwell’s celebrated approachability theory provides a general framework for a variety of learning problems, including regret minimization. However, Blackwell’s proof and implicit algorithm measure approachability using the ℓ2 (Euclidean) distance. We argue that in many applications such as regret minimization, it is more useful to study approachability under other distance metrics, most commonly the ℓ∞-metric. But, the time and space complexity of the algorithms designed for ℓ∞-approachability depend on the dimension of the space of the vectorial payoffs, which is often prohibitively large. Thus, we present a framework for converting high-dimensional ℓ∞-approachability problems to low-dimensional \emph{pseudonorm} approachability problems, thereby resolving such issues. We first show that the ℓ∞-distance between the average payoff and the approachability set can be equivalently defined as a \emph{pseudodistance} between a lower-dimensional average vector payoff and a new convex set we define. Next, we develop an algorithmic theory of pseudonorm approachability, analogous to previous work on approachability for ℓ2 and other norms, showing that it can be achieved via online linear optimization (OLO) over a convex set given by the Fenchel dual of the unit pseudonorm ball. We then use that to show, modulo mild normalization assumptions, that there exists an ℓ∞-approachability algorithm whose convergence is independent of the dimension of the original vectorial payoff. We further show that that algorithm admits a polynomial-time complexity, assuming that the original ℓ∞-distance can be computed efficiently. We also give an ℓ∞-approachability algorithm whose convergence is logarithmic in that dimension using an FTRL algorithm with a maximum-entropy regularizer. Finally, we illustrate the benefits of our framework by applying it to several problems in regret minimization.