Guarantees for Tuning the Step Size using a Learning-to-Learn Approach

Xiang Wang; Shuai Yuan; Chenwei Wu; Rong Ge

Proceedings of the 38th International Conference on Machine Learning, PMLR 139:10981-10990, 2021.

Abstract

Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem using a learning-to-learn approach (using meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates) was recently shown to be effective. However, the meta-optimization problem is difficult. In particular, the meta-gradient can often explode/vanish, and the learned optimizer may not have good generalization performance if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. Our results show that the naïve objective suffers from the meta-gradient explosion/vanishing problem. Although there is a way to design the meta-objective so that the meta-gradient remains polynomially bounded, computing the meta-gradient directly using backpropagation leads to numerical issues. We also characterize when it is necessary to compute the meta-objective on a separate validation set to ensure the generalization performance of the learned optimizer. Finally, we verify our results empirically and show that a similar phenomenon appears even for more complicated learned optimizers parametrized by neural networks.
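The explosion/vanishing behaviour described in the abstract can be illustrated with a minimal sketch. It is not the authors' code. It assumes a diagonal quadratic loss f(w) = 0.5 * sum_i lambda_i * w_i^2, plain gradient descent run for T steps, and the naive meta-objective F(eta) = f(w_T), whose derivative in eta has a closed form. The log-scaled objective (1/T) * log f(w_T) is included as one candidate for a better-behaved meta-objective; the abstract does not name the design, so treat that choice as an assumption. All names (`EIGVALS`, `COEFFS`, `STEPS`, `meta_gradient`, `log_meta_gradient`) are hypothetical.

```python
import numpy as np

# Hypothetical toy problem: eigenvalues of the quadratic, initial point, horizon T.
EIGVALS = np.array([1.0, 0.5])
COEFFS = np.array([1.0, 1.0])
STEPS = 100


def meta_objective(eta, eigvals=EIGVALS, coeffs=COEFFS, steps=STEPS):
    """Naive meta-objective F(eta) = f(w_T) after `steps` gradient descent steps.

    Each coordinate evolves as w_i <- (1 - eta * lambda_i) * w_i, so
    f(w_T) = 0.5 * sum_i lambda_i * c_i^2 * (1 - eta * lambda_i)^(2T).
    """
    shrink = 1.0 - eta * eigvals
    return 0.5 * np.sum(eigvals * coeffs**2 * shrink ** (2 * steps))


def meta_gradient(eta, eigvals=EIGVALS, coeffs=COEFFS, steps=STEPS):
    """Closed-form dF/d(eta) for the naive meta-objective above."""
    shrink = 1.0 - eta * eigvals
    return -steps * np.sum(eigvals**2 * coeffs**2 * shrink ** (2 * steps - 1))


def log_meta_gradient(eta, eigvals=EIGVALS, coeffs=COEFFS, steps=STEPS):
    """Derivative of the assumed alternative objective (1/T) * log F(eta)."""
    return meta_gradient(eta, eigvals, coeffs, steps) / (
        steps * meta_objective(eta, eigvals, coeffs, steps)
    )


if __name__ == "__main__":
    # A step size below 2 / max(lambda) contracts; one above it diverges.
    for eta in (1.0, 2.5):
        print(eta, meta_gradient(eta), log_meta_gradient(eta))
```

In this toy setting the naive meta-gradient is astronomically small for the contracting step size and astronomically large for the diverging one, while the log-scaled derivative stays of order one. Forming the ratio directly, as done here, only works because T is small; for long horizons the numerator and denominator would themselves overflow or underflow, which is in the spirit of the numerical issues the abstract mentions for backpropagation.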

Cite this Paper

BibTeX

@InProceedings{pmlr-v139-wang21ac,
  title = 	 {Guarantees for Tuning the Step Size using a Learning-to-Learn Approach},
  author =       {Wang, Xiang and Yuan, Shuai and Wu, Chenwei and Ge, Rong},
  booktitle = 	 {Proceedings of the 38th International Conference on Machine Learning},
  pages = 	 {10981--10990},
  year = 	 {2021},
  editor = 	 {Meila, Marina and Zhang, Tong},
  volume = 	 {139},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {18--24 Jul},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v139/wang21ac/wang21ac.pdf},
  url = 	 {https://proceedings.mlr.press/v139/wang21ac.html},
  abstract = 	 {Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem using a learning-to-learn approach (using meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates) was recently shown to be effective. However, the meta-optimization problem is difficult. In particular, the meta-gradient can often explode/vanish, and the learned optimizer may not have good generalization performance if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. Our results show that the naïve objective suffers from the meta-gradient explosion/vanishing problem. Although there is a way to design the meta-objective so that the meta-gradient remains polynomially bounded, computing the meta-gradient directly using backpropagation leads to numerical issues. We also characterize when it is necessary to compute the meta-objective on a separate validation set to ensure the generalization performance of the learned optimizer. Finally, we verify our results empirically and show that a similar phenomenon appears even for more complicated learned optimizers parametrized by neural networks.}
}

Endnote

%0 Conference Paper
%T Guarantees for Tuning the Step Size using a Learning-to-Learn Approach
%A Xiang Wang
%A Shuai Yuan
%A Chenwei Wu
%A Rong Ge
%B Proceedings of the 38th International Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2021
%E Marina Meila
%E Tong Zhang
%F pmlr-v139-wang21ac
%I PMLR
%P 10981--10990
%U https://proceedings.mlr.press/v139/wang21ac.html
%V 139
%X Choosing the right parameters for optimization algorithms is often the key to their success in practice. Solving this problem using a learning-to-learn approach (using meta-gradient descent on a meta-objective based on the trajectory that the optimizer generates) was recently shown to be effective. However, the meta-optimization problem is difficult. In particular, the meta-gradient can often explode/vanish, and the learned optimizer may not have good generalization performance if the meta-objective is not chosen carefully. In this paper we give meta-optimization guarantees for the learning-to-learn approach on a simple problem of tuning the step size for quadratic loss. Our results show that the naïve objective suffers from the meta-gradient explosion/vanishing problem. Although there is a way to design the meta-objective so that the meta-gradient remains polynomially bounded, computing the meta-gradient directly using backpropagation leads to numerical issues. We also characterize when it is necessary to compute the meta-objective on a separate validation set to ensure the generalization performance of the learned optimizer. Finally, we verify our results empirically and show that a similar phenomenon appears even for more complicated learned optimizers parametrized by neural networks.

APA

Wang, X., Yuan, S., Wu, C. & Ge, R. (2021). Guarantees for Tuning the Step Size using a Learning-to-Learn Approach. Proceedings of the 38th International Conference on Machine Learning, in Proceedings of Machine Learning Research 139:10981-10990. Available from https://proceedings.mlr.press/v139/wang21ac.html.
