Oracle Complexity of Second-Order Methods for Finite-Sum Problems

Yossi Arjevani; Ohad Shamir

Oracle Complexity of Second-Order Methods for Finite-Sum Problems

Yossi Arjevani, Ohad Shamir

Proceedings of the 34th International Conference on Machine Learning, PMLR 70:205-213, 2017.

Abstract

Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely on both gradients and Hessians. In principle, second-order methods can require much fewer iterations than first-order methods, and hold the promise for more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be efficiently computed, e.g. because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer – perhaps surprisingly – is negative, at least in terms of worst-case guarantees. However, we also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.

Cite this Paper

BibTeX


@InProceedings{pmlr-v70-arjevani17a,
  title = 	 {Oracle Complexity of Second-Order Methods for Finite-Sum Problems},
  author =       {Yossi Arjevani and Ohad Shamir},
  booktitle = 	 {Proceedings of the 34th International Conference on Machine Learning},
  pages = 	 {205--213},
  year = 	 {2017},
  editor = 	 {Precup, Doina and Teh, Yee Whye},
  volume = 	 {70},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {06--11 Aug},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v70/arjevani17a/arjevani17a.pdf},
  url = 	 {https://proceedings.mlr.press/v70/arjevani17a.html},
  abstract = 	 {Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely on both gradients and Hessians. In principle, second-order methods can require much fewer iterations than first-order methods, and hold the promise for more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be efficiently computed, e.g. because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer – perhaps surprisingly – is negative, at least in terms of worst-case guarantees. However, we also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.}
}

Endnote

%0 Conference Paper
%T Oracle Complexity of Second-Order Methods for Finite-Sum Problems
%A Yossi Arjevani
%A Ohad Shamir
%B Proceedings of the 34th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2017
%E Doina Precup
%E Yee Whye Teh	
%F pmlr-v70-arjevani17a
%I PMLR
%P 205--213
%U https://proceedings.mlr.press/v70/arjevani17a.html
%V 70
%X Finite-sum optimization problems are ubiquitous in machine learning, and are commonly solved using first-order methods which rely on gradient computations. Recently, there has been growing interest in second-order methods, which rely on both gradients and Hessians. In principle, second-order methods can require much fewer iterations than first-order methods, and hold the promise for more efficient algorithms. Although computing and manipulating Hessians is prohibitive for high-dimensional problems in general, the Hessians of individual functions in finite-sum problems can often be efficiently computed, e.g. because they possess a low-rank structure. Can second-order information indeed be used to solve such problems more efficiently? In this paper, we provide evidence that the answer – perhaps surprisingly – is negative, at least in terms of worst-case guarantees. However, we also discuss what additional assumptions and algorithmic approaches might potentially circumvent this negative result.

APA


Arjevani, Y. & Shamir, O.. (2017). Oracle Complexity of Second-Order Methods for Finite-Sum Problems. Proceedings of the 34th International Conference on Machine Learning, in Proceedings of Machine Learning Research 70:205-213 Available from https://proceedings.mlr.press/v70/arjevani17a.html.

Oracle Complexity of Second-Order Methods for Finite-Sum Problems

Abstract

Cite this Paper

Related Material