[edit]
Parameter-free, Dynamic, and Strongly-Adaptive Online Learning
Proceedings of the 37th International Conference on Machine Learning, PMLR 119:2250-2259, 2020.
Abstract
We provide a new online learning algorithm that for the first time combines several disparate notions of adaptivity. First, our algorithm obtains a “parameter-free” regret bound that adapts to the norm of the comparator and the squared norm of the size of the gradients it observes. Second, it obtains a “strongly-adaptive” regret bound, so that for any given interval of length N, the regret over the interval is ˜O(√N). Finally, our algorithm obtains an optimal “dynamic” regret bound: for any sequence of comparators with path-length P, our algorithm obtains regret ˜O(√PN) over intervals of length N. Our primary technique for achieving these goals is a new method of combining constrained online learning regret bounds that does not rely on an expert meta-algorithm to aggregate learners.