Logistic Regression: Tight Bounds for Stochastic and Online Optimization
Proceedings of The 27th Conference on Learning Theory, PMLR 35:197-209, 2014.
The logistic loss function is often advocated in machine learning and statistics as a smooth and strictly convex surrogate for the 0-1 loss. In this paper we investigate the question of whether these smoothness and convexity properties make the logistic loss preferable to other widely considered options such as the hinge loss. We show that in contrast to known asymptotic bounds, as long as the number of prediction/optimization iterations is sub exponential, the logistic loss provides no improvement over a generic non-smooth loss function such as the hinge loss. In particular we show that the convergence rate of stochastic logistic optimization is bounded from below by a polynomial in the diameter of the decision set and the number of prediction iterations, and provide a matching tight upper bound. This resolves the COLT open problem of McMahan and Streeter (2012).