DiMa: Understanding the Hardness of Online Matching Problems via Diffusion Models
Proceedings of the 42nd International Conference on Machine Learning, PMLR 267:75999-76022, 2025.
Abstract
We explore the potential of AI-enhanced combinatorial optimization theory, taking online bipartite matching (OBM) as a case study. In the theoretical study of OBM, hardness corresponds to a performance upper bound for a specific online algorithm or for all possible online algorithms. Typically, these upper bounds derive from challenging instances meticulously designed by theoretical computer scientists. Zhang et al. (ICML 2024) recently provided an example demonstrating how reinforcement learning techniques can strengthen the hardness result for a specific OBM model. Their attempt is inspiring but preliminary, and it is unclear whether their methods can yield similar breakthroughs on other OBM problems. This paper takes a further step by introducing DiMa, a novel unified framework for understanding the hardness of OBM problems based on denoising diffusion probabilistic models (DDPMs). DiMa models the generation of hard instances as a sequence of denoising steps and optimizes them with a novel reinforcement learning algorithm, named shortcut policy gradient (SPG). We first validate DiMa on the classic OBM problem by reproducing the hardest known input instance in the literature. We then apply DiMa to two well-known variants of OBM for which the exact hardness remains an open problem, and successfully improve their state-of-the-art theoretical upper bounds.
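As a loose illustration of the core idea, treating instance generation as a short learned denoising chain whose parameters are trained with a policy-gradient signal, the following toy sketch may help. Everything here is hypothetical: the `hardness` reward, the step count, and the linear denoising update are placeholders, not the paper's SPG algorithm or its OBM instance encoding.

```python
import numpy as np

# Toy sketch (hypothetical, not the paper's SPG): generate a candidate
# "instance" x via a short learned denoising chain, then nudge each step's
# learned mean shift with a REINFORCE-style gradient so that a placeholder
# hardness score increases.
rng = np.random.default_rng(0)

def hardness(x):
    # Placeholder reward, peaked at x = 1. A real setup would decode x into
    # an OBM input instance and score how badly an online algorithm does on it.
    return -np.sum((x - 1.0) ** 2)

T, dim, sigma, lr = 5, 4, 0.3, 0.01
theta = np.zeros((T, dim))     # learned mean shift of each denoising step
baseline, rewards = 0.0, []

for it in range(300):
    x = rng.normal(size=dim)            # start from pure noise
    eps_trace = []
    for t in range(T):
        eps = sigma * rng.normal(size=dim)
        eps_trace.append(eps)
        x = 0.5 * x + theta[t] + eps    # one toy denoising step
    r = hardness(x)
    rewards.append(r)
    # REINFORCE: grad of log N(action; theta[t], sigma^2) w.r.t. theta[t]
    # is eps / sigma^2; scale it by the advantage (reward minus baseline).
    for t in range(T):
        theta[t] += lr * (r - baseline) * eps_trace[t] / sigma**2
    baseline += 0.1 * (r - baseline)
```

The point of the sketch is only the structure: a multi-step stochastic generator trained end-to-end from a scalar "how hard is this instance" reward, which is the shape of problem that DiMa's shortcut policy gradient addresses.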