Beyond Least-Squares: Fast Rates for Regularized Empirical Risk Minimization through Self-Concordance
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:2294-2340, 2019.
Abstract
We consider learning methods based on the regularization of a convex empirical risk by a squared Hilbertian norm, a setting that includes linear predictors and non-linear predictors through positive-definite kernels. In order to go beyond the generic analysis leading to convergence rates of the excess risk as $O(1/\sqrt{n})$ from $n$ observations, we assume that the individual losses are self-concordant, that is, their third-order derivatives are bounded by their second-order derivatives. This setting includes least-squares, as well as all generalized linear models such as logistic and softmax regression. For this class of losses, we provide a bias-variance decomposition and show that the assumptions commonly made in least-squares regression, such as the source and capacity conditions, can be adapted to obtain fast non-asymptotic rates of convergence by improving the bias terms, the variance terms or both.
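As a minimal illustrative sketch (not from the paper itself), the self-concordance-type condition can be checked in closed form for the logistic loss $\varphi(t) = \log(1 + e^{t})$, whose third derivative satisfies $|\varphi'''(t)| \le \varphi''(t)$ for all $t$. The snippet below verifies this inequality numerically on a grid; the function names are illustrative only.

```python
import numpy as np

def sigmoid(t):
    """Logistic sigmoid sigma(t) = 1 / (1 + exp(-t))."""
    return 1.0 / (1.0 + np.exp(-t))

def phi2(t):
    """Second derivative of the logistic loss log(1 + exp(t)): sigma(t) * (1 - sigma(t))."""
    s = sigmoid(t)
    return s * (1.0 - s)

def phi3(t):
    """Third derivative of the logistic loss: sigma(t) * (1 - sigma(t)) * (1 - 2 * sigma(t))."""
    s = sigmoid(t)
    return s * (1.0 - s) * (1.0 - 2.0 * s)

# Check |phi'''(t)| <= phi''(t) on a wide grid, i.e. the loss is
# (pseudo) self-concordant in the sense used for generalized linear models.
t = np.linspace(-20.0, 20.0, 10001)
assert np.all(np.abs(phi3(t)) <= phi2(t) + 1e-12)
```

Since $|1 - 2\sigma(t)| \le 1$, the bound holds with constant $1$; analogous closed-form bounds hold for the softmax (multinomial logistic) loss.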