Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise

Jianyu Xu, Yu-Xiang Wang
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:9643-9662, 2022.

Abstract

In feature-based dynamic pricing, a seller sets appropriate prices for a sequence of products (described by feature vectors) on the fly by learning from the binary outcomes of previous sales sessions ("Sold" if valuation ≥ price, and "Not Sold" otherwise). Existing works either assume noiseless linear valuation or precisely-known noise distribution, which limits the applicability of those algorithms in practice when these assumptions are hard to verify. In this work, we study two more agnostic models: (a) a "linear policy" problem where we aim at competing with the best linear pricing policy while making no assumptions on the data, and (b) a "linear noisy valuation" problem where the random valuation is linear plus an unknown and assumption-free noise. For the former model, we show a Θ(d^{1/3}T^{2/3}) minimax regret up to logarithmic factors. For the latter model, we present an algorithm that achieves an O(T^{3/4}) regret and improve the best-known lower bound from Ω(T^{3/5}) to Ω(T^{2/3}). These results demonstrate that no-regret learning is possible for feature-based dynamic pricing under weak assumptions, but also reveal a disappointing fact that the seemingly richer pricing feedback is not significantly more useful than the bandit-feedback in regret reduction.
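
The feedback protocol described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' algorithm: the names `simulate_round` and `run`, the weight vector `theta`, the uniform noise, and the fixed price are all assumptions made for illustration. The sketch only shows the "linear noisy valuation" setting, where the seller observes a binary outcome ("Sold" if valuation ≥ price) and never the valuation itself.

```python
import random


def simulate_round(theta, x, price, noise):
    """One sales session: return (sold, revenue).

    The valuation is linear in the features plus noise; the seller
    only ever sees `sold`, never the valuation itself.
    """
    valuation = sum(t * xi for t, xi in zip(theta, x)) + noise
    sold = valuation >= price  # "Sold" if valuation >= price
    return sold, (price if sold else 0.0)


def run(T=5, d=3, seed=0):
    """Play T rounds with a placeholder fixed price (no learning)."""
    rng = random.Random(seed)
    theta = [0.5] * d  # hypothetical true weights, unknown to the seller
    total = 0.0
    for _ in range(T):
        x = [rng.random() for _ in range(d)]  # product feature vector
        noise = rng.uniform(-0.1, 0.1)        # stand-in for the unknown noise
        price = 0.5                           # placeholder, not a pricing policy
        sold, revenue = simulate_round(theta, x, price, noise)
        total += revenue
    return total
```

A learning algorithm would replace the fixed `price` with a rule that depends on `x` and on the history of `sold` outcomes; regret then measures the revenue gap to the best benchmark policy.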

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-xu22d,
  title = {Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise},
  author = {Xu, Jianyu and Wang, Yu-Xiang},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages = {9643--9662},
  year = {2022},
  editor = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume = {151},
  series = {Proceedings of Machine Learning Research},
  month = {28--30 Mar},
  publisher = {PMLR},
  pdf = {https://proceedings.mlr.press/v151/xu22d/xu22d.pdf},
  url = {https://proceedings.mlr.press/v151/xu22d.html},
  abstract = {In feature-based dynamic pricing, a seller sets appropriate prices for a sequence of products (described by feature vectors) on the fly by learning from the binary outcomes of previous sales sessions ("Sold" if valuation $\geq$ price, and "Not Sold" otherwise). Existing works either assume noiseless linear valuation or precisely-known noise distribution, which limits the applicability of those algorithms in practice when these assumptions are hard to verify. In this work, we study two more agnostic models: (a) a "linear policy" problem where we aim at competing with the best linear pricing policy while making no assumptions on the data, and (b) a "linear noisy valuation" problem where the random valuation is linear plus an unknown and assumption-free noise. For the former model, we show a $\Theta(d^{1/3}T^{2/3})$ minimax regret up to logarithmic factors. For the latter model, we present an algorithm that achieves an $O(T^{3/4})$ regret and improve the best-known lower bound from $\Omega(T^{3/5})$ to $\Omega(T^{2/3})$. These results demonstrate that no-regret learning is possible for feature-based dynamic pricing under weak assumptions, but also reveal a disappointing fact that the seemingly richer pricing feedback is not significantly more useful than the bandit-feedback in regret reduction.}
}
Endnote
%0 Conference Paper
%T Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise
%A Jianyu Xu
%A Yu-Xiang Wang
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-xu22d
%I PMLR
%P 9643--9662
%U https://proceedings.mlr.press/v151/xu22d.html
%V 151
%X In feature-based dynamic pricing, a seller sets appropriate prices for a sequence of products (described by feature vectors) on the fly by learning from the binary outcomes of previous sales sessions ("Sold" if valuation $\geq$ price, and "Not Sold" otherwise). Existing works either assume noiseless linear valuation or precisely-known noise distribution, which limits the applicability of those algorithms in practice when these assumptions are hard to verify. In this work, we study two more agnostic models: (a) a "linear policy" problem where we aim at competing with the best linear pricing policy while making no assumptions on the data, and (b) a "linear noisy valuation" problem where the random valuation is linear plus an unknown and assumption-free noise. For the former model, we show a $\Theta(d^{1/3}T^{2/3})$ minimax regret up to logarithmic factors. For the latter model, we present an algorithm that achieves an $O(T^{3/4})$ regret and improve the best-known lower bound from $\Omega(T^{3/5})$ to $\Omega(T^{2/3})$. These results demonstrate that no-regret learning is possible for feature-based dynamic pricing under weak assumptions, but also reveal a disappointing fact that the seemingly richer pricing feedback is not significantly more useful than the bandit-feedback in regret reduction.
APA
Xu, J. & Wang, Y. (2022). Towards Agnostic Feature-based Dynamic Pricing: Linear Policies vs Linear Valuation with Unknown Noise. Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:9643-9662. Available from https://proceedings.mlr.press/v151/xu22d.html.

Related Material