Competing Bandits in Matching Markets via Super Stability

Soumya Basu

Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:3226-3250, 2025.

Abstract

We study bandit learning in matching markets with two-sided reward uncertainty, extending prior research primarily focused on single-sided uncertainty. Leveraging the concept of ‘super-stability’ from Irving (1994), we demonstrate the advantage of the Extended Gale-Shapley (GS) algorithm over the standard GS algorithm in achieving true stable matchings under incomplete information. By employing the Extended GS algorithm, our centralized algorithm attains a logarithmic pessimal stable regret dependent on an instance-dependent admissible gap parameter. This algorithm is further adapted to a decentralized setting with a constant regret increase. Finally, we establish a novel centralized instance-dependent lower bound for binary stable regret, elucidating the roles of the admissible gap and super-stable matching in characterizing the complexity of stable matching with bandit feedback.

Cite this Paper

BibTeX

@InProceedings{pmlr-v267-basu25a,
  title = {Competing Bandits in Matching Markets via Super Stability},
  author = {Basu, Soumya},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages = {3226--3250},
  year = {2025},
  editor = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume = {267},
  series = {Proceedings of Machine Learning Research},
  month = {13--19 Jul},
  publisher = {PMLR},
  pdf = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/basu25a/basu25a.pdf},
  url = {https://proceedings.mlr.press/v267/basu25a.html},
  abstract = {We study bandit learning in matching markets with two-sided reward uncertainty, extending prior research primarily focused on single-sided uncertainty. Leveraging the concept of ‘super-stability’ from Irving (1994), we demonstrate the advantage of the Extended Gale-Shapley (GS) algorithm over the standard GS algorithm in achieving true stable matchings under incomplete information. By employing the Extended GS algorithm, our centralized algorithm attains a logarithmic pessimal stable regret dependent on an instance-dependent admissible gap parameter. This algorithm is further adapted to a decentralized setting with a constant regret increase. Finally, we establish a novel centralized instance-dependent lower bound for binary stable regret, elucidating the roles of the admissible gap and super-stable matching in characterizing the complexity of stable matching with bandit feedback.}
}

Endnote

%0 Conference Paper
%T Competing Bandits in Matching Markets via Super Stability
%A Soumya Basu
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-basu25a
%I PMLR
%P 3226--3250
%U https://proceedings.mlr.press/v267/basu25a.html
%V 267
%X We study bandit learning in matching markets with two-sided reward uncertainty, extending prior research primarily focused on single-sided uncertainty. Leveraging the concept of ‘super-stability’ from Irving (1994), we demonstrate the advantage of the Extended Gale-Shapley (GS) algorithm over the standard GS algorithm in achieving true stable matchings under incomplete information. By employing the Extended GS algorithm, our centralized algorithm attains a logarithmic pessimal stable regret dependent on an instance-dependent admissible gap parameter. This algorithm is further adapted to a decentralized setting with a constant regret increase. Finally, we establish a novel centralized instance-dependent lower bound for binary stable regret, elucidating the roles of the admissible gap and super-stable matching in characterizing the complexity of stable matching with bandit feedback.

APA

Basu, S. (2025). Competing Bandits in Matching Markets via Super Stability. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:3226-3250. Available from https://proceedings.mlr.press/v267/basu25a.html.
