FLAG n’ FLARE: Fast Linearly-Coupled Adaptive Gradient Methods

Xiang Cheng, Fred Roosta, Stefan Palombo, Peter Bartlett, Michael Mahoney
Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, PMLR 84:404-414, 2018.

Abstract

We consider first-order gradient methods for optimizing a composite objective that is the sum of a smooth function and a potentially non-smooth function. We present two accelerated and adaptive gradient methods, FLAG and FLARE, which combine the benefits of acceleration and adaptivity. They achieve the optimal convergence rate by attaining the optimal first-order oracle complexity for smooth convex optimization. In addition, they adaptively and non-uniformly re-scale the gradient direction to exploit the limited curvature available and to conform to the geometry of the domain. We show theoretically and empirically that, through the compounding effects of acceleration and adaptivity, FLAG and FLARE can be highly effective for many data fitting and machine learning applications.
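To make the two ingredients named in the abstract concrete, here is a minimal sketch of a composite objective (smooth plus non-smooth) minimized with a non-uniformly re-scaled gradient step. It is not the FLAG or FLARE algorithm: it is a plain AdaGrad-style proximal gradient loop with no acceleration, and the toy problem, the function names, and the parameters `eta` and `eps` are all assumptions chosen for illustration.

```python
import math


def soft_threshold(v, tau):
    """Proximal operator of tau * |.|: shrink v toward zero by tau."""
    return math.copysign(max(abs(v) - tau, 0.0), v)


def adaptive_prox_grad(curv, target, lam, steps=200, eta=1.0, eps=1e-8):
    """Illustrative AdaGrad-style proximal gradient (NOT FLAG/FLARE).

    Minimizes the toy composite objective
        sum_i 0.5 * curv[i] * (x[i] - target[i])**2  +  lam * sum_i |x[i]|
    where the first term is smooth and the second is non-smooth.
    """
    n = len(curv)
    x = [0.0] * n
    g_sq = [0.0] * n  # running sum of squared gradients, one per coordinate
    for _ in range(steps):
        for i in range(n):
            # Gradient of the smooth part only.
            grad = curv[i] * (x[i] - target[i])
            g_sq[i] += grad * grad
            # Per-coordinate step size: the non-uniform re-scaling.
            step = eta / (math.sqrt(g_sq[i]) + eps)
            # Scaled gradient step, then the prox of the non-smooth part.
            x[i] = soft_threshold(x[i] - step * grad, lam * step)
    return x


if __name__ == "__main__":
    # Hypothetical data: curvature differs by four orders of magnitude.
    curv = [100.0, 1.0, 0.01]
    target = [1.0, -2.0, 3.0]
    print(adaptive_prox_grad(curv, target, lam=0.05))
```

Because this toy objective is separable, its minimizer has the closed form `soft_threshold(target[i], lam / curv[i])` per coordinate, which gives a simple check on the iterates. The per-coordinate step size is what lets a single `eta` work across coordinates whose curvature differs by orders of magnitude; a uniform step would have to be tuned to the steepest coordinate.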
