Strategy-robust Online Learning in Contextual Pricing

Joon Suk Huh, Kirthevasan Kandasamy
Proceedings of The 37th International Conference on Algorithmic Learning Theory, PMLR 313:1-32, 2026.

Abstract

Learning effective pricing strategies is crucial in digital marketplaces, especially when buyers’ valuations are unknown and must be inferred through interaction. We study the online contextual pricing problem, where a seller observes a stream of context–valuation pairs and dynamically sets prices. Moreover, departing from traditional online learning frameworks, we consider a strategic setting in which buyers may misreport valuations to influence future prices—a challenge known as strategic overfitting (Amin et al., 2013). We introduce a strategy-robust notion of regret for multi-buyer online environments, capturing worst-case strategic behavior in the spirit of the Price of Anarchy. Our first contribution is a polynomial-time approximation scheme (PTAS) for learning linear pricing policies in adversarial, adaptive environments, enabled by a novel online sketching technique. Building on this result, we propose our main construction: the Sparse Update Mechanism (SUM), a simple yet effective sequential mechanism that ensures robustness to all Nash equilibria among buyers. Moreover, our construction yields a black-box reduction from online expert algorithms to strategy-robust learners.

Cite this Paper


BibTeX
@InProceedings{pmlr-v313-huh26a, title = {Strategy-robust Online Learning in Contextual Pricing}, author = {Huh, Joon Suk and Kandasamy, Kirthevasan}, booktitle = {Proceedings of The 37th International Conference on Algorithmic Learning Theory}, pages = {1--32}, year = {2026}, editor = {Telgarsky, Matus and Ullman, Jonathan}, volume = {313}, series = {Proceedings of Machine Learning Research}, month = {23--26 Feb}, publisher = {PMLR}, pdf = {https://raw.githubusercontent.com/mlresearch/v313/main/assets/huh26a/huh26a.pdf}, url = {https://proceedings.mlr.press/v313/huh26a.html}, abstract = {Learning effective pricing strategies is crucial in digital marketplaces, especially when buyers’ valuations are unknown and must be inferred through interaction. We study the online contextual pricing problem, where a seller observes a stream of context–valuation pairs and dynamically sets prices. Moreover, departing from traditional online learning frameworks, we consider a strategic setting in which buyers may misreport valuations to influence future prices—a challenge known as strategic overfitting (Amin et al., 2013). We introduce a strategy-robust notion of regret for multi-buyer online environments, capturing worst-case strategic behavior in the spirit of the Price of Anarchy. Our first contribution is a polynomial-time approximation scheme (PTAS) for learning linear pricing policies in adversarial, adaptive environments, enabled by a novel online sketching technique. Building on this result, we propose our main construction: the Sparse Update Mechanism (SUM), a simple yet effective sequential mechanism that ensures robustness to all Nash equilibria among buyers. Moreover, our construction yields a black-box reduction from online expert algorithms to strategy-robust learners.} }
Endnote
%0 Conference Paper %T Strategy-robust Online Learning in Contextual Pricing %A Joon Suk Huh %A Kirthevasan Kandasamy %B Proceedings of The 37th International Conference on Algorithmic Learning Theory %C Proceedings of Machine Learning Research %D 2026 %E Matus Telgarsky %E Jonathan Ullman %F pmlr-v313-huh26a %I PMLR %P 1--32 %U https://proceedings.mlr.press/v313/huh26a.html %V 313 %X Learning effective pricing strategies is crucial in digital marketplaces, especially when buyers’ valuations are unknown and must be inferred through interaction. We study the online contextual pricing problem, where a seller observes a stream of context–valuation pairs and dynamically sets prices. Moreover, departing from traditional online learning frameworks, we consider a strategic setting in which buyers may misreport valuations to influence future prices—a challenge known as strategic overfitting (Amin et al., 2013). We introduce a strategy-robust notion of regret for multi-buyer online environments, capturing worst-case strategic behavior in the spirit of the Price of Anarchy. Our first contribution is a polynomial-time approximation scheme (PTAS) for learning linear pricing policies in adversarial, adaptive environments, enabled by a novel online sketching technique. Building on this result, we propose our main construction: the Sparse Update Mechanism (SUM), a simple yet effective sequential mechanism that ensures robustness to all Nash equilibria among buyers. Moreover, our construction yields a black-box reduction from online expert algorithms to strategy-robust learners.
APA
Huh, J.S. & Kandasamy, K.. (2026). Strategy-robust Online Learning in Contextual Pricing. Proceedings of The 37th International Conference on Algorithmic Learning Theory, in Proceedings of Machine Learning Research 313:1-32 Available from https://proceedings.mlr.press/v313/huh26a.html.

Related Material