Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection

Riccardo De Monte, Davide Dalle Pezze, Gian Antonio Susto
Proceedings of The 4th Conference on Lifelong Learning Agents, PMLR 330:487-499, 2026.

Abstract

Real-time object detectors like YOLO achieve exceptional performance when trained on large datasets for multiple epochs. However, in real-world scenarios where data arrives incrementally, neural networks suffer from catastrophic forgetting, leading to a loss of previously learned knowledge. To address this, prior research has explored strategies for Class Incremental Learning (CIL) in Continual Learning for Object Detection (CLOD), with most approaches focusing on two-stage object detectors. However, existing work suggests that Learning without Forgetting (LwF) may be ineffective for one-stage anchor-free detectors like YOLO due to noisy regression outputs, which risk transferring corrupted knowledge. In this work, we introduce YOLO LwF, a self-distillation approach tailored for YOLO-based continual object detection. We demonstrate that when coupled with a replay memory, YOLO LwF significantly mitigates forgetting. Compared to previous approaches, it achieves state-of-the-art performance, improving mAP by +2.1% and +2.9% on the VOC and COCO benchmarks, respectively.
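For intuition, the sketch below illustrates the generic recipe the abstract describes: Learning-without-Forgetting-style self-distillation, where a frozen copy of the previous model acts as teacher, combined with a small replay memory of past images. This is a minimal PyTorch sketch under assumed names (student, replay_buffer, task_loss_fn and friends are all illustrative); it distills only classification scores, since the abstract notes that raw regression outputs can be noisy, and it assumes the model returns a single batch-first logits tensor. It is not the paper's actual YOLO LwF implementation.

# Minimal LwF-with-replay sketch (illustrative, not the paper's code).
import copy
import random

import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    # KL divergence between temperature-softened class distributions.
    t = temperature
    log_p = F.log_softmax(student_logits / t, dim=-1)
    q = F.softmax(teacher_logits / t, dim=-1)
    return F.kl_div(log_p, q, reduction="batchmean") * (t * t)

def train_task(student, task_loader, replay_buffer, task_loss_fn,
               optimizer, lambda_distill=1.0, replay_batch_size=8):
    # Freeze the model as it stood before this task; it becomes the teacher.
    teacher = copy.deepcopy(student).eval()
    for p in teacher.parameters():
        p.requires_grad_(False)

    for images, targets in task_loader:
        n_new = images.size(0)
        # Mix stored images from earlier tasks into the batch, if any.
        if replay_buffer:
            k = min(replay_batch_size, len(replay_buffer))
            old = [img.unsqueeze(0) for img in random.sample(replay_buffer, k)]
            images = torch.cat([images] + old)

        student_out = student(images)          # assumed shape: [batch, ..., classes]
        with torch.no_grad():
            teacher_out = teacher(images)

        # Task loss on the new-task samples only; distillation on everything,
        # so old-class responses are anchored on both new and replayed images.
        loss = task_loss_fn(student_out[:n_new], targets)
        loss = loss + lambda_distill * distillation_loss(student_out, teacher_out)

        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

Note that the replay memory here stores raw images only; the teacher supplies the supervision for them, which is what lets the distillation term stand in for the missing old-task labels.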

Cite this Paper

BibTeX
@InProceedings{pmlr-v330-monte26a,
  title     = {Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection},
  author    = {De Monte, Riccardo and Dalle Pezze, Davide and Susto, Gian Antonio},
  booktitle = {Proceedings of The 4th Conference on Lifelong Learning Agents},
  pages     = {487--499},
  year      = {2026},
  editor    = {Chandar, Sarath and Pascanu, Razvan and Eaton, Eric and Liu, Bing and Mahmood, Rupam and Rannen-Triki, Amal},
  volume    = {330},
  series    = {Proceedings of Machine Learning Research},
  month     = {11--14 Aug},
  publisher = {PMLR},
  pdf       = {https://raw.githubusercontent.com/mlresearch/v330/main/assets/monte26a/monte26a.pdf},
  url       = {https://proceedings.mlr.press/v330/monte26a.html},
  abstract  = {Real-time object detectors like YOLO achieve exceptional performance when trained on large datasets for multiple epochs. However, in real-world scenarios where data arrives incrementally, neural networks suffer from catastrophic forgetting, leading to a loss of previously learned knowledge. To address this, prior research has explored strategies for Class Incremental Learning (CIL) in Continual Learning for Object Detection (CLOD), with most approaches focusing on two-stage object detectors. However, existing work suggests that Learning without Forgetting (LwF) may be ineffective for one-stage anchor-free detectors like YOLO due to noisy regression outputs, which risk transferring corrupted knowledge. In this work, we introduce YOLO LwF, a self-distillation approach tailored for YOLO-based continual object detection. We demonstrate that when coupled with a replay memory, YOLO LwF significantly mitigates forgetting. Compared to previous approaches, it achieves state-of-the-art performance, improving mAP by +2.1% and +2.9% on the VOC and COCO benchmarks, respectively.}
}
Endnote
%0 Conference Paper
%T Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection
%A Riccardo De Monte
%A Davide Dalle Pezze
%A Gian Antonio Susto
%B Proceedings of The 4th Conference on Lifelong Learning Agents
%C Proceedings of Machine Learning Research
%D 2026
%E Sarath Chandar
%E Razvan Pascanu
%E Eric Eaton
%E Bing Liu
%E Rupam Mahmood
%E Amal Rannen-Triki
%F pmlr-v330-monte26a
%I PMLR
%P 487--499
%U https://proceedings.mlr.press/v330/monte26a.html
%V 330
%X Real-time object detectors like YOLO achieve exceptional performance when trained on large datasets for multiple epochs. However, in real-world scenarios where data arrives incrementally, neural networks suffer from catastrophic forgetting, leading to a loss of previously learned knowledge. To address this, prior research has explored strategies for Class Incremental Learning (CIL) in Continual Learning for Object Detection (CLOD), with most approaches focusing on two-stage object detectors. However, existing work suggests that Learning without Forgetting (LwF) may be ineffective for one-stage anchor-free detectors like YOLO due to noisy regression outputs, which risk transferring corrupted knowledge. In this work, we introduce YOLO LwF, a self-distillation approach tailored for YOLO-based continual object detection. We demonstrate that when coupled with a replay memory, YOLO LwF significantly mitigates forgetting. Compared to previous approaches, it achieves state-of-the-art performance, improving mAP by +2.1% and +2.9% on the VOC and COCO benchmarks, respectively.
APA
De Monte, R., Dalle Pezze, D. & Susto, G.A. (2026). Teach YOLO to Remember: A Self-Distillation Approach for Continual Object Detection. Proceedings of The 4th Conference on Lifelong Learning Agents, in Proceedings of Machine Learning Research 330:487-499. Available from https://proceedings.mlr.press/v330/monte26a.html.