[edit]
An Optimal High-Order Tensor Method for Convex Optimization
Proceedings of the Thirty-Second Conference on Learning Theory, PMLR 99:1799-1801, 2019.
Abstract
This paper is concerned with finding an optimal algorithm for minimizing a composite convex objective function. The basic setting is that the objective is the sum of two convex functions: the first function is smooth with up to the d-th order derivative information available, and the second function is possibly non-smooth, but its proximal tensor mappings can be computed approximately in an efficient manner. The problem is to find – in that setting – the best possible (optimal) iteration complexity for convex optimization. Along that line, for the smooth case (without the second non-smooth part in the objective), Nesterov (1983) proposed an optimal algorithm for the first-order methods (d=1) with iteration complexity O( 1 / k^2 ). A high-order tensor algorithm with iteration complexity of O( 1 / k^{d+1} ) was proposed by Baes (2009) and Nesterov (2018). In this paper, we propose a new high-order tensor algorithm for the general composite case, with the iteration complexity of O( 1 / k^{(3d+1)/2} ), which matches the lower bound for the d-th order methods as established in Nesterov (2018) and Shamir et al. (2018), and hence is optimal. Our approach is based on the Accelerated Hybrid Proximal Extragradient (A-HPE) framework proposed in Monteiro and Svaiter (2013), where a bisection procedure is installed for each A-HPE iteration. At each bisection step a proximal tensor subproblem is approximately solved, and the total number of bisection steps per A-HPE iteration is bounded by a logarithmic factor in the precision required.