DiMa: Understanding the Hardness of Online Matching Problems via Diffusion Models

Boyu Zhang, Aocheng Shen, Bing Liu, Qiankun Zhang, Bin Yuan, Jing Wang, Shenghao Liu, Xianjun Deng
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:75999-76022, 2025.

Abstract

We explore the potential of AI-enhanced combinatorial optimization theory, taking online bipartite matching (OBM) as a case study. In the theoretical study of OBM, hardness corresponds to an upper bound on the performance of a specific online algorithm or of all possible online algorithms. Typically, these upper bounds are derived from challenging instances meticulously designed by theoretical computer scientists. Zhang et al. (ICML 2024) recently provided an example demonstrating how reinforcement learning techniques can strengthen the hardness result for a specific OBM model. Their attempt is inspiring but preliminary: it is unclear whether their methods can be applied to other OBM problems with similar success. This paper takes a further step by introducing DiMa, a novel, unified framework for understanding the hardness of OBM problems, built on denoising diffusion probabilistic models (DDPMs). DiMa models the generation of hard instances as a sequence of denoising steps and optimizes them with a novel reinforcement learning algorithm named shortcut policy gradient (SPG). We first validate DiMa on the classic OBM problem, where it reproduces the hardest known input instance in the literature. We then apply DiMa to two well-known variants of OBM for which the exact hardness remains an open problem, and we improve their state-of-the-art theoretical upper bounds.
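
To make the notion of "hardness via hard instances" concrete, below is a minimal illustrative sketch in Python (not from the paper, and not the DiMa pipeline). It evaluates the classic RANKING algorithm of Karp, Vazirani and Vazirani (1990) on the well-known upper-triangular adversarial construction for classic OBM, commonly used to establish the 1 - 1/e bound the abstract alludes to. The empirical competitive ratio approaches 1 - 1/e ≈ 0.632 as n grows; all function names here are illustrative.

# Illustrative sketch (not the authors' code): the "upper triangular"
# adversarial instance for online bipartite matching, where online vertex t
# is adjacent to offline vertices t, t+1, ..., n-1. Running RANKING on it
# empirically recovers a competitive ratio approaching 1 - 1/e ~ 0.632.
import random

def upper_triangular_instance(n):
    """Online vertex t (arriving t-th) is adjacent to offline vertices t..n-1."""
    return [list(range(t, n)) for t in range(n)]

def ranking(neighbors, n, rng):
    """RANKING: draw a random priority order over offline vertices; match each
    arriving online vertex to its highest-priority unmatched neighbor."""
    rank = list(range(n))
    rng.shuffle(rank)                      # rank[v] = priority of offline vertex v
    matched = [False] * n
    size = 0
    for nbrs in neighbors:
        free = [v for v in nbrs if not matched[v]]
        if free:
            v = min(free, key=lambda u: rank[u])   # highest priority = smallest rank
            matched[v] = True
            size += 1
    return size

if __name__ == "__main__":
    n, trials = 200, 500
    rng = random.Random(0)
    inst = upper_triangular_instance(n)
    avg = sum(ranking(inst, n, rng) for _ in range(trials)) / trials
    # The offline optimum is a perfect matching of size n, so avg / n
    # estimates the competitive ratio on this instance.
    print(f"empirical ratio ~ {avg / n:.3f}  (1 - 1/e ~ 0.632)")

DiMa's contribution, as described in the abstract, is to search for such adversarial instances automatically (via denoising steps optimized by SPG) rather than constructing them by hand.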

Cite this Paper


BibTeX
@InProceedings{pmlr-v267-zhang25bp,
  title     = {{D}i{M}a: Understanding the Hardness of Online Matching Problems via Diffusion Models},
  author    = {Zhang, Boyu and Shen, Aocheng and Liu, Bing and Zhang, Qiankun and Yuan, Bin and Wang, Jing and Liu, Shenghao and Deng, Xianjun},
  booktitle = {Proceedings of the 42nd International Conference on Machine Learning},
  pages     = {75999--76022},
  year      = {2025},
  editor    = {Singh, Aarti and Fazel, Maryam and Hsu, Daniel and Lacoste-Julien, Simon and Berkenkamp, Felix and Maharaj, Tegan and Wagstaff, Kiri and Zhu, Jerry},
  volume    = {267},
  series    = {Proceedings of Machine Learning Research},
  month     = {13--19 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v267/main/assets/zhang25bp/zhang25bp.pdf},
  url       = {https://proceedings.mlr.press/v267/zhang25bp.html}
}
Endnote
%0 Conference Paper
%T DiMa: Understanding the Hardness of Online Matching Problems via Diffusion Models
%A Boyu Zhang
%A Aocheng Shen
%A Bing Liu
%A Qiankun Zhang
%A Bin Yuan
%A Jing Wang
%A Shenghao Liu
%A Xianjun Deng
%B Proceedings of the 42nd International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2025
%E Aarti Singh
%E Maryam Fazel
%E Daniel Hsu
%E Simon Lacoste-Julien
%E Felix Berkenkamp
%E Tegan Maharaj
%E Kiri Wagstaff
%E Jerry Zhu
%F pmlr-v267-zhang25bp
%I PMLR
%P 75999--76022
%U https://proceedings.mlr.press/v267/zhang25bp.html
%V 267
APA
Zhang, B., Shen, A., Liu, B., Zhang, Q., Yuan, B., Wang, J., Liu, S. & Deng, X. (2025). DiMa: Understanding the Hardness of Online Matching Problems via Diffusion Models. Proceedings of the 42nd International Conference on Machine Learning, in Proceedings of Machine Learning Research 267:75999-76022. Available from https://proceedings.mlr.press/v267/zhang25bp.html.
