A Dynamic Penalization Framework for Online Rank-1 Semidefinite Programming Relaxations
Proceedings of the 7th Annual Learning for Dynamics \& Control Conference, PMLR 283:1012-1024, 2025.
Abstract
We propose a dynamic penalization framework for recovering rank-1 solutions in sequential semidefinite programming (SDP) relaxations. Obtaining rank-1 solutions, which is crucial for recovering physically meaningful decisions in many applications, becomes particularly challenging in dynamic environments where problem parameters continuously evolve. Our framework operates in two interconnected phases: the learning phase dynamically adjusts penalty parameters to enforce rank-1 feasibility based on feedback from the decision phase, while the decision phase solves the resulting penalized SDP relaxations using the penalty parameters specified by the learning phase. To accelerate rank-1 recovery across sequential problems, we introduce a meta-learning model that provides informed initializations for the penalty matrices. The meta-learning model leverages historical data from previously solved tasks, eliminating the need for externally curated datasets. By using task-specific features and updates from prior iterations, the meta-model initializes penalty parameters so as to reduce the number of iterations required between the two phases. We prove sublinear convergence to rank-1 solutions and establish low dynamic regret bounds that improve with task similarity. Empirical results on real-world rank-constrained applications, including the Max-Cut problem and Optimal Power Flow (OPF), demonstrate that our method consistently recovers rank-1 solutions.
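To make the two-phase loop concrete, below is a minimal sketch in Python/CVXPY for a single Max-Cut task. It assumes, since the abstract does not specify them, a linear penalty of the form trace(P X) in the decision phase and a learning-phase update that grows P along the non-dominant eigenspace of the returned solution; the paper's actual penalty form, update rule, and meta-learned initialization may differ. The graph, step size, and tolerance below are illustrative placeholders.

```python
# Hedged sketch of the learning/decision loop from the abstract (not the
# paper's exact method). Decision phase: solve the penalized Max-Cut SDP
# relaxation. Learning phase: update the penalty matrix P from spectral
# feedback on the decision-phase solution X.
import numpy as np
import cvxpy as cp

def decision_phase(L, P):
    """Solve the penalized Max-Cut SDP relaxation for graph Laplacian L."""
    n = L.shape[0]
    X = cp.Variable((n, n), PSD=True)
    # Standard Goemans-Williamson relaxation, minus the assumed trace penalty.
    objective = cp.Maximize(cp.trace(L @ X) / 4 - cp.trace(P @ X))
    cp.Problem(objective, [cp.diag(X) == 1]).solve()
    return X.value

def learning_phase(X, P, step=0.5, tol=1e-6):
    """Increase the penalty along the trailing eigenspace of X."""
    w, V = np.linalg.eigh(X)      # eigenvalues in ascending order
    residual = w[:-1]             # mass outside the dominant eigenvector
    if residual.sum() < tol:      # numerically rank-1: stop iterating
        return P, True
    U = V[:, :-1]                 # trailing eigenvectors to penalize
    return P + step * U @ np.diag(residual) @ U.T, False

# Two-phase loop on a small random graph (hypothetical test instance).
rng = np.random.default_rng(0)
A = (rng.random((8, 8)) < 0.4).astype(float)
A = np.triu(A, 1)
A = A + A.T
L = np.diag(A.sum(axis=1)) - A
P = np.zeros_like(L)  # a meta-learned initialization would replace this
for _ in range(20):
    X = decision_phase(L, P)
    P, is_rank1 = learning_phase(X, P)
    if is_rank1:
        break
```

In this sketch the decision phase leaves the feasible set unchanged and only reshapes the objective, so each iterate remains a valid relaxation; penalizing the trailing eigenspace pushes mass toward a single dominant eigenvector, which is one common way to drive an SDP iterate toward rank 1.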