[edit]
Clipping the Price of Adaptivity at the Tail
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:4267-4307, 2026.
Abstract
Adaptive stochastic convex optimization (SCO) methods face a fundamental “price of adaptivity” barrier: under the standard set of assumptions, they cannot efficiently adapt to large uncertainty in both the initial distance to optimality and the Lipschitz constant. We circumvent this barrier by requiring a small amount of additional structure common to many learning problems. Specifically, we assume that the objective decomposes into a model and a loss function, enabling us to intervene by modifying the model’s output before it passes to the loss function. Under this assumption, we design a method that clips the learned model output in tail events where it deviates too much from the output of a fixed reference model. Our method matches the optimal bounds for known-parameter SCO up to logarithmic factors in the uncertainty in the distance and Lipschitz parameters, thus efficiently adapting to large uncertainty in both.