Enhancing Thermal Image Object Detection using Spatial Edge-aware Attention and Self-supervision Pretext
Proceedings of the 39th Canadian Conference on Artificial Intelligence, PMLR 318:538-551, 2026.
Abstract
Thermal cameras offer robust sensing for object detection in low-visibility driving conditions, but thermal images often suffer from lower resolution and weaker object boundaries than RGB imagery. This paper presents SEA-YOLO-E (Spatial Edge Attention YOLO-E), an enhanced single-modality thermal object detector that integrates a SEA mechanism and semi-supervised learning to overcome these challenges. First, we introduce the SEA-YOLO architecture, which embeds an Edge Extractor and a novel SEA module into a YOLOv8 backbone to emphasize object boundaries and improve detection accuracy in the thermal domain. Building on this architecture, we extend SEA-YOLO with a semi-supervised learning paradigm: a self-supervised rotation prediction pretext task leverages unlabeled infrared images to learn general feature representations, and synthetic thermal data mitigates class imbalance in training. The proposed two-phase training (self-supervised pretraining followed by supervised fine-tuning) significantly boosts detection performance. Experiments on multiple thermal driving datasets demonstrate that SEA-YOLO-E achieves state-of-the-art results, with improvements of up to 9–12% in mAP over existing detectors. Notably, our edge-enhanced attention and rotation-pretrained model outperforms recent multi-modal RGB-thermal detectors while using only thermal input.
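As an illustration of the rotation prediction pretext task mentioned in the abstract, the following is a minimal sketch of how unlabeled images could be turned into (rotated image, rotation label) training pairs. It is not the authors' implementation: the function names `rotate90` and `make_rotation_batch`, the restriction to four rotations (0°, 90°, 180°, 270°), and the use of plain nested lists in place of image tensors are all assumptions made for the example.

```python
import random


def rotate90(grid, k):
    """Rotate a 2D list counterclockwise by k * 90 degrees (hypothetical helper)."""
    for _ in range(k % 4):
        # Transpose the grid, then reverse the row order: one 90-degree CCW turn.
        grid = [list(row) for row in zip(*grid)][::-1]
    return grid


def make_rotation_batch(images, seed=0):
    """Pair each unlabeled image with a random rotation class in {0, 1, 2, 3}.

    The class index k is the pretext label: a network trained to predict k
    from the rotated image needs no human annotation. (Assumed setup; the
    abstract does not specify the rotation set.)
    """
    rng = random.Random(seed)
    samples = []
    for image in images:
        k = rng.randrange(4)
        samples.append((rotate90(image, k), k))
    return samples


if __name__ == "__main__":
    toy_image = [[1, 2], [3, 4]]  # stand-in for a thermal frame
    for rotated, label in make_rotation_batch([toy_image] * 3, seed=1):
        print(label, rotated)
```

In a two-phase setup of the kind the abstract describes, a backbone would first be trained to classify these rotation labels and then fine-tuned on labeled detection data; those training loops are omitted here.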