Disentangling the Causes of Plasticity Loss in Neural Networks

Clare Lyle, Zeyu Zheng, Khimya Khetarpal, Hado van Hasselt, Razvan Pascanu, James Martens, Will Dabney
Proceedings of The 3rd Conference on Lifelong Learning Agents, PMLR 274:750-783, 2025.

Abstract

Underpinning the past decades of work on the design, initialization, and optimization of neural networks is a seemingly innocuous assumption: that the network is trained on a stationary data distribution. In settings where this assumption is violated, e.g. deep reinforcement learning, learning algorithms become unstable and brittle with respect to hyperparameters and even random seeds. One factor driving this instability is the loss of plasticity, meaning that updating the network’s predictions in response to new information becomes more difficult as training progresses. While many recent works provide analyses and partial solutions to this phenomenon, a fundamental question remains unanswered: to what extent do known mechanisms of plasticity loss overlap, and how can mitigation strategies be combined to best maintain the trainability of a network? This paper addresses these questions, showing that loss of plasticity can be decomposed into multiple independent mechanisms and that, while intervening on any single mechanism is insufficient to avoid the loss of plasticity in all cases, intervening on multiple mechanisms in conjunction results in highly robust learning algorithms. We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks, and further demonstrate its effectiveness on naturally arising nonstationarities, including reinforcement learning in the Arcade Learning Environment.
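The abstract's headline finding, that layer normalization combined with weight decay keeps networks trainable under distribution shift, is straightforward to reproduce in miniature. The sketch below is an illustrative assumption of such a setup, not the authors' code: the architecture sizes and the synthetic task (fixed Gaussian inputs whose labels are reshuffled at each task boundary) are stand-ins in the spirit of the paper's synthetic nonstationary benchmarks. A network that retains plasticity should reach roughly the same loss on every task; dropping both interventions typically lets the final per-task loss drift upward as training progresses.

# Minimal sketch (PyTorch), assuming a toy label-reshuffling nonstationarity.
import torch
import torch.nn as nn

torch.manual_seed(0)

def make_mlp(use_layernorm: bool) -> nn.Module:
    layers, in_dim, width, depth = [], 32, 256, 4
    for _ in range(depth):
        layers.append(nn.Linear(in_dim, width))
        if use_layernorm:
            # Intervention 1: layer normalization after each hidden layer.
            layers.append(nn.LayerNorm(width))
        layers.append(nn.ReLU())
        in_dim = width
    layers.append(nn.Linear(width, 10))
    return nn.Sequential(*layers)

net = make_mlp(use_layernorm=True)
# Intervention 2: weight decay, here via AdamW's decoupled L2 penalty.
opt = torch.optim.AdamW(net.parameters(), lr=1e-3, weight_decay=1e-2)
loss_fn = nn.CrossEntropyLoss()

x = torch.randn(1024, 32)  # fixed inputs; only the targets change across tasks
for task in range(20):
    y = torch.randint(0, 10, (1024,))  # reshuffled labels = distribution shift
    for _ in range(200):
        opt.zero_grad()
        loss = loss_fn(net(x), y)
        loss.backward()
        opt.step()
    # A plastic network fits each new task about as well as the first one.
    print(f"task {task:2d}: final loss {loss.item():.3f}")

Comparing runs with use_layernorm=False and weight_decay=0.0 against the configuration above gives a rough, qualitative view of the effect the paper studies; the paper's actual experiments span a range of synthetic tasks and reinforcement learning in the Arcade Learning Environment.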

Cite this Paper


BibTeX
@InProceedings{pmlr-v274-lyle25a,
  title     = {Disentangling the Causes of Plasticity Loss in Neural Networks},
  author    = {Lyle, Clare and Zheng, Zeyu and Khetarpal, Khimya and van Hasselt, Hado and Pascanu, Razvan and Martens, James and Dabney, Will},
  booktitle = {Proceedings of The 3rd Conference on Lifelong Learning Agents},
  pages     = {750--783},
  year      = {2025},
  editor    = {Lomonaco, Vincenzo and Melacci, Stefano and Tuytelaars, Tinne and Chandar, Sarath and Pascanu, Razvan},
  volume    = {274},
  series    = {Proceedings of Machine Learning Research},
  month     = {29 Jul--01 Aug},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v274/main/assets/lyle25a/lyle25a.pdf},
  url       = {https://proceedings.mlr.press/v274/lyle25a.html}
}
Endnote
%0 Conference Paper
%T Disentangling the Causes of Plasticity Loss in Neural Networks
%A Clare Lyle
%A Zeyu Zheng
%A Khimya Khetarpal
%A Hado van Hasselt
%A Razvan Pascanu
%A James Martens
%A Will Dabney
%B Proceedings of The 3rd Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2025
%E Vincenzo Lomonaco
%E Stefano Melacci
%E Tinne Tuytelaars
%E Sarath Chandar
%E Razvan Pascanu
%F pmlr-v274-lyle25a
%I PMLR
%P 750--783
%U https://proceedings.mlr.press/v274/lyle25a.html
%V 274
APA
Lyle, C., Zheng, Z., Khetarpal, K., van Hasselt, H., Pascanu, R., Martens, J. & Dabney, W. (2025). Disentangling the Causes of Plasticity Loss in Neural Networks. Proceedings of The 3rd Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 274:750-783. Available from https://proceedings.mlr.press/v274/lyle25a.html.

Related Material

Download PDF: https://raw.githubusercontent.com/mlresearch/v274/main/assets/lyle25a/lyle25a.pdf