Clipping the Price of Adaptivity at the Tail

Itai Kreisler, Yair Carmon, Oliver Hinder
Proceedings of Thirty Ninth Conference on Learning Theory, PMLR 336:4267-4307, 2026.

Abstract

Adaptive stochastic convex optimization (SCO) methods face a fundamental “price of adaptivity” barrier: under the standard set of assumptions, they cannot efficiently adapt to large uncertainty in both the initial distance to optimality and the Lipschitz constant. We circumvent this barrier by requiring a small amount of additional structure common to many learning problems. Specifically, we assume that the objective decomposes into a model and a loss function, enabling us to intervene by modifying the model’s output before it passes to the loss function. Under this assumption, we design a method that clips the learned model output in tail events where it deviates too much from the output of a fixed reference model. Our method matches the optimal bounds for known-parameter SCO up to logarithmic factors in the uncertainty in the distance and Lipschitz parameters, thus efficiently adapting to large uncertainty in both.
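The clipping intervention described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it only shows the general idea of limiting how far a learned model's scalar output may deviate from a fixed reference model's output before the loss is evaluated. The names `clip_to_reference`, `clipped_loss`, `absolute_loss`, and the threshold `radius` are hypothetical, and how the paper chooses the threshold or the reference model is not stated here.

```python
def clip_to_reference(model_out: float, ref_out: float, radius: float) -> float:
    """Keep model_out within `radius` of ref_out (hypothetical helper).

    Outputs already inside [ref_out - radius, ref_out + radius] pass through
    unchanged; outputs outside it (the "tail events") are pulled back to the
    nearest endpoint of that interval.
    """
    deviation = model_out - ref_out
    clipped_deviation = max(-radius, min(radius, deviation))
    return ref_out + clipped_deviation


def absolute_loss(prediction: float, label: float) -> float:
    """A simple convex, Lipschitz loss used only for illustration."""
    return abs(prediction - label)


def clipped_loss(model_out: float, ref_out: float, label: float,
                 radius: float, loss=absolute_loss) -> float:
    """Evaluate the loss on the clipped model output instead of the raw one."""
    return loss(clip_to_reference(model_out, ref_out, radius), label)
```

For instance, with a reference output of 1.0 and a radius of 1.0, a learned output of 3.0 would be clipped to 2.0 before reaching the loss, while an output of 0.5 would pass through unchanged.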

Cite this Paper


BibTeX
@InProceedings{pmlr-v336-kreisler26a,
  title = {Clipping the Price of Adaptivity at the Tail},
  author = {Kreisler, Itai and Carmon, Yair and Hinder, Oliver},
  booktitle = {Proceedings of Thirty Ninth Conference on Learning Theory},
  pages = {4267--4307},
  year = {2026},
  editor = {Hanneke, Steve and Lattimore, Tor},
  volume = {336},
  series = {Proceedings of Machine Learning Research},
  month = {29 Jun--03 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v336/main/assets/kreisler26a/kreisler26a.pdf},
  url = {https://proceedings.mlr.press/v336/kreisler26a.html},
  abstract = {Adaptive stochastic convex optimization (SCO) methods face a fundamental “price of adaptivity” barrier: under the standard set of assumptions, they cannot efficiently adapt to large uncertainty in both the initial distance to optimality and the Lipschitz constant. We circumvent this barrier by requiring a small amount of additional structure common to many learning problems. Specifically, we assume that the objective decomposes into a model and a loss function, enabling us to intervene by modifying the model’s output before it passes to the loss function. Under this assumption, we design a method that clips the learned model output in tail events where it deviates too much from the output of a fixed reference model. Our method matches the optimal bounds for known-parameter SCO up to logarithmic factors in the uncertainty in the distance and Lipschitz parameters, thus efficiently adapting to large uncertainty in both.}
}
Endnote
%0 Conference Paper
%T Clipping the Price of Adaptivity at the Tail
%A Itai Kreisler
%A Yair Carmon
%A Oliver Hinder
%B Proceedings of Thirty Ninth Conference on Learning Theory
%C Proceedings of Machine Learning Research
%D 2026
%E Steve Hanneke
%E Tor Lattimore
%F pmlr-v336-kreisler26a
%I PMLR
%P 4267--4307
%U https://proceedings.mlr.press/v336/kreisler26a.html
%V 336
%X Adaptive stochastic convex optimization (SCO) methods face a fundamental “price of adaptivity” barrier: under the standard set of assumptions, they cannot efficiently adapt to large uncertainty in both the initial distance to optimality and the Lipschitz constant. We circumvent this barrier by requiring a small amount of additional structure common to many learning problems. Specifically, we assume that the objective decomposes into a model and a loss function, enabling us to intervene by modifying the model’s output before it passes to the loss function. Under this assumption, we design a method that clips the learned model output in tail events where it deviates too much from the output of a fixed reference model. Our method matches the optimal bounds for known-parameter SCO up to logarithmic factors in the uncertainty in the distance and Lipschitz parameters, thus efficiently adapting to large uncertainty in both.
APA
Kreisler, I., Carmon, Y., & Hinder, O. (2026). Clipping the Price of Adaptivity at the Tail. Proceedings of Thirty Ninth Conference on Learning Theory, in Proceedings of Machine Learning Research 336:4267-4307. Available from https://proceedings.mlr.press/v336/kreisler26a.html.
