Efficient Learning with a Family of Nonconvex Regularizers by Redistributing Nonconvexity
Proceedings of The 33rd International Conference on Machine Learning, PMLR 48:2645-2654, 2016.
The use of convex regularizers allow for easy optimization, though they often produce biased estimation and inferior prediction performance. Recently, nonconvex regularizers have attracted a lot of attention and outperformed convex ones. However, the resultant optimization problem is much harder. In this paper, for a large class of nonconvex regularizers, we propose to move the nonconvexity from the regularizer to the loss. The nonconvex regularizer is then transformed to a familiar convex regularizer, while the resultant loss function can still be guaranteed to be smooth. Learning with the convexified regularizer can be performed by existing efficient algorithms originally designed for convex regularizers (such as the standard proximal algorithm and Frank-Wolfe algorithm). Moreover, it can be shown that critical points of the transformed problem are also critical points of the original problem. Extensive experiments on a number of nonconvex regularization problems show that the proposed procedure is much faster than the state-of-the-art nonconvex solvers.