Debiasing a First-order Heuristic for Approximate Bi-level Optimization

Valerii Likhosherstov; Xingyou Song; Krzysztof Choromanski; Jared Q Davis; Adrian Weller

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:6621-6630, 2021.

Abstract

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory complexity proportional to the length $r$ of its inner optimization loop. To address this complexity, an earlier first-order method (FOM) was proposed as a heuristic which omits second derivative terms, yielding significant speed gains and requiring only constant memory. Despite FOM’s popularity, there is a lack of theoretical understanding of its convergence properties. We contribute by theoretically characterizing FOM’s gradient bias under mild assumptions. We further demonstrate a rich family of examples where FOM-based SGD does not converge to a stationary point of the ABLO objective. We address this concern by proposing an unbiased FOM (UFOM) enjoying constant memory complexity as a function of $r$. We characterize the introduced time-variance tradeoff, demonstrate convergence bounds, and find an optimal UFOM for a given ABLO problem. Finally, we propose an efficient adaptive UFOM scheme.
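
As a rough illustration of the gradient bias the abstract refers to, the sketch below compares an exact outer gradient with a first-order approximation on a toy scalar problem. Everything here is an assumption made for illustration and is not taken from the paper: the quadratic inner and outer losses, the constants `alpha`, `a`, `c`, `t`, and the function names. It shows only the FOM-style shortcut of dropping the second-derivative factor; it does not implement UFOM.

```python
def inner_loop(theta, r, alpha=0.1, a=2.0, c=1.0):
    """Run r gradient steps on the toy inner loss 0.5 * a * (phi - c)**2,
    starting from phi = theta (hypothetical setup, not from the paper)."""
    phi = theta
    for _ in range(r):
        phi -= alpha * a * (phi - c)  # inner gradient is a * (phi - c)
    return phi


def outer_objective(theta, r, t=3.0, alpha=0.1, a=2.0, c=1.0):
    """Toy outer loss 0.5 * (phi_r - t)**2 evaluated after the inner loop."""
    return 0.5 * (inner_loop(theta, r, alpha, a, c) - t) ** 2


def exact_grad(theta, r, t=3.0, alpha=0.1, a=2.0, c=1.0):
    """Exact d(outer)/d(theta) via the chain rule through all r inner steps.
    Each step contributes a factor (1 - alpha * a), where a is the second
    derivative of the inner loss."""
    phi_r = inner_loop(theta, r, alpha, a, c)
    return (phi_r - t) * (1.0 - alpha * a) ** r


def fom_grad(theta, r, t=3.0, alpha=0.1, a=2.0, c=1.0):
    """First-order approximation: treat d(phi_r)/d(theta) as 1, i.e. drop
    every second-derivative factor from the chain rule."""
    phi_r = inner_loop(theta, r, alpha, a, c)
    return phi_r - t
```

With these assumed constants, the two estimates differ by the factor `(1 - alpha * a) ** r`, so the gap widens as the inner loop length `r` grows. In this scalar toy the factor has a closed form; in a general problem the exact gradient requires backpropagating through all `r` inner steps, which is the source of the cost proportional to `r` that the abstract mentions.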

Cite this Paper

BibTeX


@InProceedings{pmlr-v139-likhosherstov21a,
  title = 	 {Debiasing a First-order Heuristic for Approximate Bi-level Optimization},
  author =       {Likhosherstov, Valerii and Song, Xingyou and Choromanski, Krzysztof and Davis, Jared Q and Weller, Adrian},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {6621--6630},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/likhosherstov21a/likhosherstov21a.pdf},
  url = 	 {https://proceedings.mlr.press/v139/likhosherstov21a.html},
  abstract = 	 {Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory complexity proportional to the length $r$ of its inner optimization loop. To address this complexity, an earlier first-order method (FOM) was proposed as a heuristic which omits second derivative terms, yielding significant speed gains and requiring only constant memory. Despite FOM’s popularity, there is a lack of theoretical understanding of its convergence properties. We contribute by theoretically characterizing FOM’s gradient bias under mild assumptions. We further demonstrate a rich family of examples where FOM-based SGD does not converge to a stationary point of the ABLO objective. We address this concern by proposing an unbiased FOM (UFOM) enjoying constant memory complexity as a function of $r$. We characterize the introduced time-variance tradeoff, demonstrate convergence bounds, and find an optimal UFOM for a given ABLO problem. Finally, we propose an efficient adaptive UFOM scheme.}
}

Endnote

%0 Conference Paper
%T Debiasing a First-order Heuristic for Approximate Bi-level Optimization
%A Valerii Likhosherstov
%A Xingyou Song
%A Krzysztof Choromanski
%A Jared Q Davis
%A Adrian Weller
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-likhosherstov21a
%I PMLR
%P 6621--6630
%U https://proceedings.mlr.press/v139/likhosherstov21a.html
%V 139
%X Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory complexity proportional to the length $r$ of its inner optimization loop. To address this complexity, an earlier first-order method (FOM) was proposed as a heuristic which omits second derivative terms, yielding significant speed gains and requiring only constant memory. Despite FOM’s popularity, there is a lack of theoretical understanding of its convergence properties. We contribute by theoretically characterizing FOM’s gradient bias under mild assumptions. We further demonstrate a rich family of examples where FOM-based SGD does not converge to a stationary point of the ABLO objective. We address this concern by proposing an unbiased FOM (UFOM) enjoying constant memory complexity as a function of $r$. We characterize the introduced time-variance tradeoff, demonstrate convergence bounds, and find an optimal UFOM for a given ABLO problem. Finally, we propose an efficient adaptive UFOM scheme.

APA


Likhosherstov, V., Song, X., Choromanski, K., Davis, J. Q., & Weller, A. (2021). Debiasing a First-order Heuristic for Approximate Bi-level Optimization. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:6621-6630. Available from https://proceedings.mlr.press/v139/likhosherstov21a.html.
