Online Assortment and Price Optimization Under Contextual Choice Models

Yigit Efe Erginbas, Thomas Courtade, Kannan Ramchandran
Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, PMLR 258:4456-4464, 2025.

Abstract

We consider an assortment selection and pricing problem in which a seller has $N$ different items available for sale. In each round, the seller observes a $d$-dimensional contextual preference information vector for the user, and offers to the user an assortment of $K$ items at prices chosen by the seller. The user selects at most one of the products from the offered assortment according to a multinomial logit choice model whose parameters are unknown. The seller observes which, if any, item is chosen at the end of each round, with the goal of maximizing cumulative revenue over a selling horizon of length $T$. For this problem, we propose an algorithm that learns from user feedback and achieves a revenue regret of order $\widetilde{\mathcal{O}}(d \sqrt{K T} / L_0 )$ where $L_0$ is the minimum price sensitivity parameter. We also obtain a lower bound of order $\Omega(d \sqrt{T}/ L_0)$ for the regret achievable by any algorithm.
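As a rough illustration of the choice model described above, the following Python sketch computes multinomial logit choice probabilities with a no-purchase option, and the expected revenue of a priced assortment. The utility form used here (a base utility minus a price sensitivity times the price) and the names `mnl_choice_probabilities`, `expected_revenue`, `base_utilities`, and `sensitivities` are assumptions made for this example; the abstract does not specify the parameterization, and this is not the authors' algorithm.

```python
import math


def mnl_choice_probabilities(utilities):
    """Return (p_no_purchase, [p_item, ...]) under a multinomial logit model.

    Assumption: the no-purchase (outside) option has utility 0.
    """
    # Shift by the largest utility (including the outside option's 0)
    # so that the exponentials cannot overflow.
    m = max([0.0] + list(utilities))
    weights = [math.exp(u - m) for u in utilities]
    outside = math.exp(-m)
    total = outside + sum(weights)
    return outside / total, [w / total for w in weights]


def expected_revenue(base_utilities, sensitivities, prices):
    """Expected one-round revenue of an offered assortment.

    Assumption (hypothetical utility form): u_i = base_i - sensitivity_i * price_i,
    where every sensitivity_i is at least some minimum L_0 > 0.
    """
    utilities = [b - a * p for b, a, p in zip(base_utilities, sensitivities, prices)]
    _, probs = mnl_choice_probabilities(utilities)
    return sum(price * prob for price, prob in zip(prices, probs))
```

With two items whose utilities are both 0, each item and the no-purchase option are chosen with probability 1/3, so offering both at price 1 yields an expected revenue of 2/3 under these assumptions. In the online setting of the paper, the parameters behind the utilities are unknown and must be learned from the observed choices.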

Cite this Paper


BibTeX
@InProceedings{pmlr-v258-erginbas25a,
  title = {Online Assortment and Price Optimization Under Contextual Choice Models},
  author = {Erginbas, Yigit Efe and Courtade, Thomas and Ramchandran, Kannan},
  booktitle = {Proceedings of The 28th International Conference on Artificial Intelligence and Statistics},
  pages = {4456--4464},
  year = {2025},
  editor = {Li, Yingzhen and Mandt, Stephan and Agrawal, Shipra and Khan, Emtiyaz},
  volume = {258},
  series = {Proceedings of Machine Learning Research},
  month = {03--05 May},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v258/main/assets/erginbas25a/erginbas25a.pdf},
  url = {https://proceedings.mlr.press/v258/erginbas25a.html},
  abstract = {We consider an assortment selection and pricing problem in which a seller has $N$ different items available for sale. In each round, the seller observes a $d$-dimensional contextual preference information vector for the user, and offers to the user an assortment of $K$ items at prices chosen by the seller. The user selects at most one of the products from the offered assortment according to a multinomial logit choice model whose parameters are unknown. The seller observes which, if any, item is chosen at the end of each round, with the goal of maximizing cumulative revenue over a selling horizon of length $T$. For this problem, we propose an algorithm that learns from user feedback and achieves a revenue regret of order $\widetilde{\mathcal{O}}(d \sqrt{K T} / L_0 )$ where $L_0$ is the minimum price sensitivity parameter. We also obtain a lower bound of order $\Omega(d \sqrt{T}/ L_0)$ for the regret achievable by any algorithm.}
}
Endnote
%0 Conference Paper
%T Online Assortment and Price Optimization Under Contextual Choice Models
%A Yigit Efe Erginbas
%A Thomas Courtade
%A Kannan Ramchandran
%B Proceedings of The 28th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2025
%E Yingzhen Li
%E Stephan Mandt
%E Shipra Agrawal
%E Emtiyaz Khan
%F pmlr-v258-erginbas25a
%I PMLR
%P 4456--4464
%U https://proceedings.mlr.press/v258/erginbas25a.html
%V 258
%X We consider an assortment selection and pricing problem in which a seller has $N$ different items available for sale. In each round, the seller observes a $d$-dimensional contextual preference information vector for the user, and offers to the user an assortment of $K$ items at prices chosen by the seller. The user selects at most one of the products from the offered assortment according to a multinomial logit choice model whose parameters are unknown. The seller observes which, if any, item is chosen at the end of each round, with the goal of maximizing cumulative revenue over a selling horizon of length $T$. For this problem, we propose an algorithm that learns from user feedback and achieves a revenue regret of order $\widetilde{\mathcal{O}}(d \sqrt{K T} / L_0 )$ where $L_0$ is the minimum price sensitivity parameter. We also obtain a lower bound of order $\Omega(d \sqrt{T}/ L_0)$ for the regret achievable by any algorithm.
APA
Erginbas, Y.E., Courtade, T. & Ramchandran, K. (2025). Online Assortment and Price Optimization Under Contextual Choice Models. Proceedings of The 28th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 258:4456-4464. Available from https://proceedings.mlr.press/v258/erginbas25a.html.
