On the Oracle Complexity of Higher-Order Smooth Non-Convex Finite-Sum Optimization

Nicolas Emmenegger, Rasmus Kyng, Ahad N. Zehmakan
Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, PMLR 151:10718-10752, 2022.

Abstract

We prove lower bounds for higher-order methods in smooth non-convex finite-sum optimization. Our contribution is threefold: We first show that a deterministic algorithm cannot profit from the finite-sum structure of the objective and that simulating a pth-order regularized method on the whole function by constructing exact gradient information is optimal up to constant factors. We further show lower bounds for randomized algorithms and compare them with the best-known upper bounds. To address some gaps between the bounds, we propose a new second-order smoothness assumption that can be seen as an analogue of the first-order mean-squared smoothness assumption. We prove that it is sufficient to ensure state-of-the-art convergence guarantees while allowing for a sharper lower bound.
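To fix notation for readers new to this setting, the sketch below recalls, in standard notation not taken from the paper itself, the finite-sum objective, the classical first-order mean-squared smoothness condition that the proposed second-order assumption is described as paralleling, and the generic pth-order regularized step mentioned in the abstract; the exact definitions and constants used in the paper may differ.

% Finite-sum objective: minimize the average of n smooth (possibly non-convex) components.
\min_{x \in \mathbb{R}^d} \; F(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x)

% Classical first-order mean-squared smoothness (expectation over a uniformly random index i);
% the paper's new assumption is described as a second-order analogue of this condition.
\mathbb{E}_i\big[\, \|\nabla f_i(x) - \nabla f_i(y)\|^2 \,\big] \;\le\; \bar{L}^2 \, \|x - y\|^2 \qquad \text{for all } x, y \in \mathbb{R}^d

% Generic pth-order regularized step (standard form from the literature, shown only as an
% illustration): minimize the pth-order Taylor model of F at x_k plus a (p+1)st-power penalty.
x_{k+1} \in \operatorname*{arg\,min}_{y} \; \sum_{j=0}^{p} \frac{1}{j!} \nabla^j F(x_k)[y - x_k]^j \;+\; \frac{M}{(p+1)!} \, \|y - x_k\|^{p+1}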

Cite this Paper


BibTeX
@InProceedings{pmlr-v151-emmenegger22a,
  title     = {On the Oracle Complexity of Higher-Order Smooth Non-Convex Finite-Sum Optimization},
  author    = {Emmenegger, Nicolas and Kyng, Rasmus and Zehmakan, Ahad N.},
  booktitle = {Proceedings of The 25th International Conference on Artificial Intelligence and Statistics},
  pages     = {10718--10752},
  year      = {2022},
  editor    = {Camps-Valls, Gustau and Ruiz, Francisco J. R. and Valera, Isabel},
  volume    = {151},
  series    = {Proceedings of Machine Learning Research},
  month     = {28--30 Mar},
  publisher = {PMLR},
  pdf       = {https://proceedings.mlr.press/v151/emmenegger22a/emmenegger22a.pdf},
  url       = {https://proceedings.mlr.press/v151/emmenegger22a.html},
  abstract  = {We prove lower bounds for higher-order methods in smooth non-convex finite-sum optimization. Our contribution is threefold: We first show that a deterministic algorithm cannot profit from the finite-sum structure of the objective and that simulating a pth-order regularized method on the whole function by constructing exact gradient information is optimal up to constant factors. We further show lower bounds for randomized algorithms and compare them with the best-known upper bounds. To address some gaps between the bounds, we propose a new second-order smoothness assumption that can be seen as an analogue of the first-order mean-squared smoothness assumption. We prove that it is sufficient to ensure state-of-the-art convergence guarantees while allowing for a sharper lower bound.}
}
Endnote
%0 Conference Paper
%T On the Oracle Complexity of Higher-Order Smooth Non-Convex Finite-Sum Optimization
%A Nicolas Emmenegger
%A Rasmus Kyng
%A Ahad N. Zehmakan
%B Proceedings of The 25th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2022
%E Gustau Camps-Valls
%E Francisco J. R. Ruiz
%E Isabel Valera
%F pmlr-v151-emmenegger22a
%I PMLR
%P 10718--10752
%U https://proceedings.mlr.press/v151/emmenegger22a.html
%V 151
%X We prove lower bounds for higher-order methods in smooth non-convex finite-sum optimization. Our contribution is threefold: We first show that a deterministic algorithm cannot profit from the finite-sum structure of the objective and that simulating a pth-order regularized method on the whole function by constructing exact gradient information is optimal up to constant factors. We further show lower bounds for randomized algorithms and compare them with the best-known upper bounds. To address some gaps between the bounds, we propose a new second-order smoothness assumption that can be seen as an analogue of the first-order mean-squared smoothness assumption. We prove that it is sufficient to ensure state-of-the-art convergence guarantees while allowing for a sharper lower bound.
APA
Emmenegger, N., Kyng, R. & Zehmakan, A.N.. (2022). On the Oracle Complexity of Higher-Order Smooth Non-Convex Finite-Sum Optimization . Proceedings of The 25th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 151:10718-10752 Available from https://proceedings.mlr.press/v151/emmenegger22a.html.