[edit]

# One-sided Frank-Wolfe algorithms for saddle problems

*Proceedings of the 38th International Conference on Machine Learning*, PMLR 139:5665-5675, 2021.

#### Abstract

We study a class of convex-concave saddle-point problems of the form $\min_x\max_y ⟨Kx,y⟩+f_{\cal P}(x)-h^*(y)$ where $K$ is a linear operator, $f_{\cal P}$ is the sum of a convex function $f$ with a Lipschitz-continuous gradient and the indicator function of a bounded convex polytope ${\cal P}$, and $h^\ast$ is a convex (possibly nonsmooth) function. Such problem arises, for example, as a Lagrangian relaxation of various discrete optimization problems. Our main assumptions are the existence of an efficient {\em linear minimization oracle} ($lmo$) for $f_{\cal P}$ and an efficient {\em proximal map} ($prox$) for $h^*$ which motivate the solution via a blend of proximal primal-dual algorithms and Frank-Wolfe algorithms. In case $h^*$ is the indicator function of a linear constraint and function $f$ is quadratic, we show a $O(1/n^2)$ convergence rate on the dual objective, requiring $O(n \log n)$ calls of $lmo$. If the problem comes from the constrained optimization problem $\min_{x\in\mathbb R^d}\{f_{\cal P}(x)\:|\:Ax-b=0\}$ then we additionally get bound $O(1/n^2)$ both on the primal gap and on the infeasibility gap. In the most general case, we show a $O(1/n)$ convergence rate of the primal-dual gap again requiring $O(n\log n)$ calls of $lmo$. To the best of our knowledge, this improves on the known convergence rates for the considered class of saddle-point problems. We show applications to labeling problems frequently appearing in machine learning and computer vision.