Nonconvex Conditional Gradient Sliding
Proceedings of the 35th International Conference on Machine Learning, PMLR 80:4208-4217, 2018.
Abstract
We investigate a projection-free optimization method, namely nonconvex conditional gradient sliding (NCGS), for nonconvex optimization problems in the batch, stochastic, and finite-sum settings. The conditional gradient sliding (CGS) method, which cleverly integrates Nesterov’s accelerated gradient method with the Frank-Wolfe (FW) method, outperforms FW for convex optimization by reducing the number of gradient computations. However, the study of CGS in the nonconvex setting is limited. In this paper, we propose nonconvex conditional gradient sliding (NCGS) methods and analyze their convergence properties. We also leverage the idea of variance reduction from recent progress in convex optimization to obtain a new algorithm, termed variance-reduced NCGS (NCGS-VR), which achieves a faster convergence rate than batch NCGS in the finite-sum setting. We show that the NCGS algorithms outperform their Frank-Wolfe counterparts both in theory and in practice in all three settings, namely the batch, stochastic, and finite-sum settings. This significantly improves our understanding of optimizing nonconvex functions over complicated feasible sets (where projection is prohibitively expensive).
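To make the "projection-free" idea concrete: Frank-Wolfe-style methods like those discussed above replace the projection step with a linear minimization oracle (LMO) over the feasible set, which is often far cheaper. The sketch below is a minimal classic Frank-Wolfe loop over the probability simplex (an illustration of the underlying primitive, not the paper's NCGS algorithm); on the simplex, the LMO reduces to a single coordinate argmin, and the function, step-size rule, and iteration count are illustrative choices.

```python
import numpy as np

def frank_wolfe_simplex(grad, x0, n_iters=100):
    """Frank-Wolfe (conditional gradient) over the probability simplex.

    The linear minimization oracle over the simplex just selects the
    vertex e_i with the smallest gradient coordinate, so no projection
    onto the feasible set is ever computed.
    """
    x = x0.copy()
    for t in range(n_iters):
        g = grad(x)
        i = int(np.argmin(g))            # LMO: argmin over simplex vertices of <g, s>
        s = np.zeros_like(x)
        s[i] = 1.0
        gamma = 2.0 / (t + 2)            # standard open-loop step size
        x = (1 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

# Illustrative problem: minimize f(x) = 0.5 * ||x - b||^2 over the simplex,
# where b already lies in the simplex (so b is the minimizer).
b = np.array([0.2, 0.5, 0.3])
x0 = np.array([1.0, 0.0, 0.0])
x_hat = frank_wolfe_simplex(lambda x: x - b, x0)
```

Because every iterate is a convex combination of simplex vertices, feasibility is maintained by construction, which is exactly why such methods avoid expensive projections.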