Computing D-Stationary Points of $ρ$-Margin Loss SVM
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:3772-3793, 2022.
This paper is concerned with the algorithmic aspects of sharper stationarity of a nonconvex, nonsmooth, Clarke irregular machine learning model. We study the SVM problem with a $\rho$-margin loss function, which is the margin theory generalization bound of SVM introduced in the learning theory textbook by Mohri et al. , and has been extensively studied in operations research, statistics, and machine learning communities. However, due to its nonconvex, nonsmooth, and irregular nature, none of the existing optimization methods can efficiently compute a d(irectional)-stationary point, which turns out to be also a local minimum, for the $\rho$-margin loss SVM problem. After a detailed discussion of various nonsmooth stationarity notions, we propose a highly efficient nonconvex semi-proximal ADMM-based scheme that provably computes d-stationary points and enjoys a local linear convergence rate. We report concrete examples to demonstrate the necessity of our assumptions. Numerical results verify the effectiveness of the new algorithm and complement our theoretical results.