Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

Lucy L Gao; Jane Ye; Haian Yin; Shangzhi Zeng; Jin Zhang

Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems

Lucy L Gao, Jane Ye, Haian Yin, Shangzhi Zeng, Jin Zhang

Proceedings of the 39th International Conference on Machine Learning, PMLR 162:7164-7182, 2022.

Abstract

Existing gradient-based optimization methods for hyperparameter tuning can only guarantee theoretical convergence to stationary solutions when the bilevel program satisfies the condition that for fixed upper-level variables, the lower-level is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We then ask: can this algorithm achieve stationary solutions without LLSC and LLS assumptions? We provide a positive answer to this question for bilevel programs from a broad class of hyperparameter tuning applications. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed VF-iDCA when applied to tune hyperparameters.

Cite this Paper

BibTeX

@InProceedings{pmlr-v162-gao22j,
  title = 	 {Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems},
  author =       {Gao, Lucy L and Ye, Jane and Yin, Haian and Zeng, Shangzhi and Zhang, Jin},
  booktitle = 	 {Proceedings of the 39th International Conference on Machine Learning},
  pages = 	 {7164--7182},
  year = 	 {2022},
  editor = 	 {Chaudhuri, Kamalika and Jegelka, Stefanie and Song, Le and Szepesvari, Csaba and Niu, Gang and Sabato, Sivan},
  volume = 	 {162},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {17--23 Jul},
  publisher =    {PMLR},
  pdf = 	 {https://proceedings.mlr.press/v162/gao22j/gao22j.pdf},
  url = 	 {https://proceedings.mlr.press/v162/gao22j.html},
  abstract = 	 {Existing gradient-based optimization methods for hyperparameter tuning can only guarantee theoretical convergence to stationary solutions when the bilevel program satisfies the condition that for fixed upper-level variables, the lower-level is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We then ask: can this algorithm achieve stationary solutions without LLSC and LLS assumptions? We provide a positive answer to this question for bilevel programs from a broad class of hyperparameter tuning applications. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed VF-iDCA when applied to tune hyperparameters.}
}

Endnote

%0 Conference Paper
%T Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems
%A Lucy L Gao
%A Jane Ye
%A Haian Yin
%A Shangzhi Zeng
%A Jin Zhang
%B Proceedings of the 39th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2022
%E Kamalika Chaudhuri
%E Stefanie Jegelka
%E Le Song
%E Csaba Szepesvari
%E Gang Niu
%E Sivan Sabato	
%F pmlr-v162-gao22j
%I PMLR
%P 7164--7182
%U https://proceedings.mlr.press/v162/gao22j.html
%V 162
%X Existing gradient-based optimization methods for hyperparameter tuning can only guarantee theoretical convergence to stationary solutions when the bilevel program satisfies the condition that for fixed upper-level variables, the lower-level is strongly convex (LLSC) and smooth (LLS). This condition is not satisfied for bilevel programs arising from tuning hyperparameters in many machine learning algorithms. In this work, we develop a sequentially convergent Value Function based Difference-of-Convex Algorithm with inexactness (VF-iDCA). We then ask: can this algorithm achieve stationary solutions without LLSC and LLS assumptions? We provide a positive answer to this question for bilevel programs from a broad class of hyperparameter tuning applications. Extensive experiments justify our theoretical results and demonstrate the superiority of the proposed VF-iDCA when applied to tune hyperparameters.

APA

Gao, L.L., Ye, J., Yin, H., Zeng, S. & Zhang, J.. (2022). Value Function based Difference-of-Convex Algorithm for Bilevel Hyperparameter Selection Problems. Proceedings of the 39th International Conference on Machine Learning, in Proceedings of Machine Learning Research 162:7164-7182 Available from https://proceedings.mlr.press/v162/gao22j.html.

Related Material

Download PDF