A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle

Nadav Hallak, Kfir Yehuda Levy
Proceedings of the 41st International Conference on Machine Learning, PMLR 235:17313-17332, 2024.

Abstract

This paper studies the theoretical guarantees of the classical projected gradient and conditional gradient methods applied to constrained optimization problems with biased, relative-error gradient oracles. Such oracles arise in various settings, such as distributed optimization systems or derivative-free optimization, and are particularly common when gradients are compressed, quantized, or estimated via finite-difference computations. Several settings are investigated: optimization over the box with a coordinate-wise erroneous gradient oracle, optimization over a general compact convex set, and three more specific scenarios. Convergence guarantees are established with respect to the relative-error magnitude; in particular, we show that the conditional gradient method is invariant to the relative error when applied over the box with a coordinate-wise erroneous gradient oracle, and that the projected gradient method maintains its convergence guarantees when optimizing a nonconvex objective function.
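
To make the oracle model concrete, the following minimal Python sketch illustrates the setting described in the abstract: a coordinate-wise relative-error gradient oracle, one projected gradient step, and one conditional gradient step over a box. It is purely illustrative and is not the paper's algorithm or analysis; the function names and the toy objective are hypothetical, and the random perturbation merely stands in for the deterministic (worst-case) corruption studied in the paper.

import numpy as np

def relative_error_oracle(grad, eps, rng):
    """Corrupt each gradient coordinate by a relative factor of at most eps,
    i.e. |g_i - grad_i| <= eps * |grad_i| (illustrative stand-in for the
    deterministic corruption considered in the paper)."""
    return grad * (1.0 + rng.uniform(-eps, eps, size=grad.shape))

def projected_gradient_step(x, g, step, lo, hi):
    """One projected gradient step over the box [lo, hi]^d (projection = clip)."""
    return np.clip(x - step * g, lo, hi)

def conditional_gradient_step(x, g, lo, hi, gamma):
    """One conditional (Frank-Wolfe) step over the box: the linear minimizer
    over a box depends only on the signs of g, so s_i = lo where g_i > 0
    and s_i = hi otherwise."""
    s = np.where(g > 0, lo, hi)
    return x + gamma * (s - x)

# Toy usage: minimize f(x) = 0.5 * ||x - c||^2 over the box [-1, 1]^3.
rng = np.random.default_rng(0)
c = np.array([2.0, -0.5, 0.3])
grad_f = lambda x: x - c
x_pg, x_cg = np.zeros(3), np.zeros(3)
for t in range(200):
    g_pg = relative_error_oracle(grad_f(x_pg), 0.2, rng)
    g_cg = relative_error_oracle(grad_f(x_cg), 0.2, rng)
    x_pg = projected_gradient_step(x_pg, g_pg, step=0.5, lo=-1.0, hi=1.0)
    x_cg = conditional_gradient_step(x_cg, g_cg, lo=-1.0, hi=1.0, gamma=2.0 / (t + 2))
print(x_pg, x_cg)  # both should approach the projection of c onto the box: [1, -0.5, 0.3]

Intuitively, the invariance of the conditional gradient over the box stated in the abstract can be read off the linear-minimization step in this sketch: over a box it depends only on the signs of the gradient coordinates, and a coordinate-wise relative error of magnitude strictly below one cannot flip those signs.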

Cite this Paper


BibTeX
@InProceedings{pmlr-v235-hallak24a,
  title     = {A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle},
  author    = {Hallak, Nadav and Levy, Kfir Yehuda},
  booktitle = {Proceedings of the 41st International Conference on Machine Learning},
  pages     = {17313--17332},
  year      = {2024},
  editor    = {Salakhutdinov, Ruslan and Kolter, Zico and Heller, Katherine and Weller, Adrian and Oliver, Nuria and Scarlett, Jonathan and Berkenkamp, Felix},
  volume    = {235},
  series    = {Proceedings of Machine Learning Research},
  month     = {21--27 Jul},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v235/main/assets/hallak24a/hallak24a.pdf},
  url       = {https://proceedings.mlr.press/v235/hallak24a.html},
  abstract  = {This paper studies the theoretical guarantees of the classical projected gradient and conditional gradient methods applied to constrained optimization problems with biased relative-error gradient oracles. These oracles are used in various settings, such as distributed optimization systems or derivative-free optimization, and are particularly common when gradients are compressed, quantized, or estimated via finite differences computations. Several settings are investigated: Optimization over the box with a coordinate-wise erroneous gradient oracle, optimization over a general compact convex set, and three more specific scenarios. Convergence guarantees are established with respect to the relative-error magnitude, and in particular, we show that the conditional gradient is invariant to relative-error when applied over the box with a coordinate-wise erroneous gradient oracle, and the projected gradient maintains its convergence guarantees when optimizing a nonconvex objective function.}
}
Endnote
%0 Conference Paper
%T A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle
%A Nadav Hallak
%A Kfir Yehuda Levy
%B Proceedings of the 41st International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2024
%E Ruslan Salakhutdinov
%E Zico Kolter
%E Katherine Heller
%E Adrian Weller
%E Nuria Oliver
%E Jonathan Scarlett
%E Felix Berkenkamp
%F pmlr-v235-hallak24a
%I PMLR
%P 17313--17332
%U https://proceedings.mlr.press/v235/hallak24a.html
%V 235
%X This paper studies the theoretical guarantees of the classical projected gradient and conditional gradient methods applied to constrained optimization problems with biased relative-error gradient oracles. These oracles are used in various settings, such as distributed optimization systems or derivative-free optimization, and are particularly common when gradients are compressed, quantized, or estimated via finite differences computations. Several settings are investigated: Optimization over the box with a coordinate-wise erroneous gradient oracle, optimization over a general compact convex set, and three more specific scenarios. Convergence guarantees are established with respect to the relative-error magnitude, and in particular, we show that the conditional gradient is invariant to relative-error when applied over the box with a coordinate-wise erroneous gradient oracle, and the projected gradient maintains its convergence guarantees when optimizing a nonconvex objective function.
APA
Hallak, N. & Levy, K. Y. (2024). A Study of First-Order Methods with a Deterministic Relative-Error Gradient Oracle. Proceedings of the 41st International Conference on Machine Learning, in Proceedings of Machine Learning Research 235:17313-17332. Available from https://proceedings.mlr.press/v235/hallak24a.html.
