A Study of Condition Numbers for First-Order Optimization

Charles Guille-Escuret; Manuela Girotti; Baptiste Goujaud; Ioannis Mitliagkas

A Study of Condition Numbers for First-Order Optimization

Charles Guille-Escuret, Manuela Girotti, Baptiste Goujaud, Ioannis Mitliagkas

Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, PMLR 130:1261-1269, 2021.

Abstract

In this work we introduce a new framework for the theoretical study of convergence and tuning of first-order optimization algorithms (FOA). The study of such algorithms typically requires assumptions on the objective functions: the most popular ones are probably smoothness and strong convexity. These metrics are used to tune the hyperparameters of FOA. We introduce a class of perturbations quantified via a new norm, called *-norm. We show that adding a small perturbation to the objective function has an equivalently small impact on the behavior of any FOA, which suggests that it should have a minor impact on the tuning of the algorithm. However, we show that smoothness and strong convexity can be heavily impacted by arbitrarily small perturbations, leading to excessively conservative tunings and convergence issues. In view of these observations, we propose a notion of continuity of the metrics, which is essential for a robust tuning strategy. Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous. We describe their mutual relations and provide their guaranteed convergence rates for the Gradient Descent algorithm accordingly tuned. Finally we discuss how our work impacts the theoretical understanding of FOA and their performances.

Cite this Paper

BibTeX


@InProceedings{pmlr-v130-guille-escuret21a,
  title = 	 { A Study of Condition Numbers for First-Order Optimization },
  author =       {Guille-Escuret, Charles and Girotti, Manuela and Goujaud, Baptiste and Mitliagkas, Ioannis},
  booktitle = 	 {Proceedings of The 24th International Conference on Artificial Intelligence and Statistics},
  pages = 	 {1261--1269},
  year = 	 {2021},
  editor = 	 {Banerjee, Arindam and Fukumizu, Kenji},
  volume = 	 {130},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {13--15 Apr},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v130/guille-escuret21a/guille-escuret21a.pdf},
  url = 	 {https://proceedings.mlr.press/v130/guille-escuret21a.html},
  abstract = 	 { In this work we introduce a new framework for the theoretical study of convergence and tuning of first-order optimization algorithms (FOA). The study of such algorithms typically requires assumptions on the objective functions: the most popular ones are probably smoothness and strong convexity. These metrics are used to tune the hyperparameters of FOA. We introduce a class of perturbations quantified via a new norm, called *-norm. We show that adding a small perturbation to the objective function has an equivalently small impact on the behavior of any FOA, which suggests that it should have a minor impact on the tuning of the algorithm. However, we show that smoothness and strong convexity can be heavily impacted by arbitrarily small perturbations, leading to excessively conservative tunings and convergence issues. In view of these observations, we propose a notion of continuity of the metrics, which is essential for a robust tuning strategy. Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous. We describe their mutual relations and provide their guaranteed convergence rates for the Gradient Descent algorithm accordingly tuned. Finally we discuss how our work impacts the theoretical understanding of FOA and their performances. }
}

Endnote

%0 Conference Paper
%T  A Study of Condition Numbers for First-Order Optimization 
%A Charles Guille-Escuret
%A Manuela Girotti
%A Baptiste Goujaud
%A Ioannis Mitliagkas
%B Proceedings of The 24th International Conference on Artificial Intelligence and Statistics
%C Proceedings of Machine Learning Research
%D 2021
%E Arindam Banerjee
%E Kenji Fukumizu	
%F pmlr-v130-guille-escuret21a
%I PMLR
%P 1261--1269
%U https://proceedings.mlr.press/v130/guille-escuret21a.html
%V 130
%X  In this work we introduce a new framework for the theoretical study of convergence and tuning of first-order optimization algorithms (FOA). The study of such algorithms typically requires assumptions on the objective functions: the most popular ones are probably smoothness and strong convexity. These metrics are used to tune the hyperparameters of FOA. We introduce a class of perturbations quantified via a new norm, called *-norm. We show that adding a small perturbation to the objective function has an equivalently small impact on the behavior of any FOA, which suggests that it should have a minor impact on the tuning of the algorithm. However, we show that smoothness and strong convexity can be heavily impacted by arbitrarily small perturbations, leading to excessively conservative tunings and convergence issues. In view of these observations, we propose a notion of continuity of the metrics, which is essential for a robust tuning strategy. Since smoothness and strong convexity are not continuous, we propose a comprehensive study of existing alternative metrics which we prove to be continuous. We describe their mutual relations and provide their guaranteed convergence rates for the Gradient Descent algorithm accordingly tuned. Finally we discuss how our work impacts the theoretical understanding of FOA and their performances.

APA


Guille-Escuret, C., Girotti, M., Goujaud, B. & Mitliagkas, I.. (2021).  A Study of Condition Numbers for First-Order Optimization . Proceedings of The 24th International Conference on Artificial Intelligence and Statistics, in Proceedings of Machine Learning Research 130:1261-1269 Available from https://proceedings.mlr.press/v130/guille-escuret21a.html.

A Study of Condition Numbers for First-Order Optimization

Abstract

Cite this Paper

Related Material