Optimizing the Learning Rate for the Online Training of Neural Networks
Proceedings of The 3rd Conference on Lifelong Learning Agents, PMLR 274:798-814, 2025.
Abstract
Efficient training via gradient-based optimization techniques is an essential building block of the success of artificial neural networks. Extensive research on the impact of the learning rate and on effective ways to estimate an appropriate value has partly enabled these techniques. Despite the proliferation of data streams generated by IoT devices, digital platforms, and many other sources, previous research has primarily focused on batch learning, which assumes that all training data is available a priori. However, characteristics such as the gradual emergence of data and distributional shifts, also known as \textit{concept drift}, pose additional challenges. Therefore, the findings on batch learning may not apply to streaming environments, where the underlying model needs to adapt on the fly each time a new data instance appears. In this work, we seek to address this knowledge gap by (i) evaluating typical learning rate schedules and techniques for adapting these schedules to concept drift, and (ii) investigating the effectiveness of optimization techniques with adaptive step sizes in the context of stream-based neural network training. We also (iii) introduce a novel \textit{pre-tuning} approach, which we find makes learning rate tuning performed prior to the stream learning process more effective than conventional tuning.
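
To make the setting concrete, the sketch below illustrates one plausible way a learning rate schedule can be adapted to concept drift during online training; it is not the paper's method. The synthetic stream, the inverse-time decay parameters, and the threshold-based drift stub are all assumptions introduced purely for illustration.

```python
# Illustrative sketch only: online SGD training on a data stream, with an
# inverse-time learning rate schedule that is restarted when drift is signalled.
import torch
import torch.nn as nn


def synthetic_stream(n=1000):
    """Toy data stream with one abrupt concept drift halfway through (illustration only)."""
    w = torch.randn(10)
    for t in range(n):
        if t == n // 2:
            w = torch.randn(10)               # abrupt change in the target concept
        x = torch.randn(1, 10)
        y = x @ w.unsqueeze(1)
        yield x, y


def drift_detected(loss_value):
    """Placeholder drift signal; a real system would use a detector such as ADWIN."""
    return loss_value > 5.0                   # arbitrary threshold, for illustration only


model = nn.Linear(10, 1)                      # toy model for a 10-feature stream
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.MSELoss()

eta0, decay = 0.1, 1e-3                       # hypothetical schedule parameters
steps_since_reset = 0

for x, y in synthetic_stream():               # one (x, y) instance at a time
    # Inverse-time decay, restarted after each detected drift.
    lr = eta0 / (1.0 + decay * steps_since_reset)
    for group in optimizer.param_groups:
        group["lr"] = lr

    optimizer.zero_grad()
    loss = loss_fn(model(x), y)
    loss.backward()
    optimizer.step()
    steps_since_reset += 1

    if drift_detected(loss.item()):
        steps_since_reset = 0                 # reset the schedule so the model can re-adapt
```

Restarting the schedule on drift is only one of several adaptation strategies the abstract alludes to; optimizers with adaptive step sizes (e.g., Adam) would replace the manual schedule entirely.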