Cross-Level Feature Relocation: Mitigating Information Loss in Cross-Layer Feature Fusion for Crowd Counting
Proceedings of the 16th Asian Conference on Machine Learning, PMLR 260:1352-1367, 2025.
Abstract
In crowd counting, significant challenges persist due to scale variation, occlusion, and complex scene interference. Merging feature maps from different levels of the backbone network is an intuitive and efficient approach to addressing these issues. However, existing multi-scale merging algorithms often overlook a critical aspect: feature maps at different levels typically have varying resolutions, and traditional interpolation-based methods for feature fusion result in significant information loss, limiting the algorithm’s multi-scale perception capability. To address this issue, we propose the Cross-Level Feature Relocation Module (CFRM), which regresses features across different levels into a unified representation space and utilizes a cross-level attention mechanism to transfer complementary information from low-resolution to high-resolution feature maps, significantly enhancing effective information utilization. Based on CFRM, we introduce the Cross-Level Feature Relocation Network (CFRNet), which exhibits strong multi-scale perception capabilities. Extensive experiments on five datasets and comprehensive ablation studies demonstrate the effectiveness of CFRM.
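The abstract does not spell out CFRM's internals, so the following is only a rough illustration of the general idea it describes: features from two levels are projected into one shared space, and each high-resolution position then gathers information from all low-resolution positions through attention rather than through interpolation. Everything here is an assumption made for illustration, not the authors' implementation: the function names (`linear`, `cross_level_attention`), the random projection weights, the residual sum, and the toy shapes.

```python
import math
import random


def linear(tokens, weight):
    """Project each token with a weight matrix of shape [in_dim][out_dim]."""
    out_dim = len(weight[0])
    return [
        [sum(t[i] * weight[i][o] for i in range(len(t))) for o in range(out_dim)]
        for t in tokens
    ]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_level_attention(high_tokens, low_tokens, w_high, w_low):
    """Hypothetical sketch: high-res tokens query low-res tokens.

    Both levels are first mapped into a shared d-dimensional space
    (an assumed stand-in for the "unified representation space").
    """
    q = linear(high_tokens, w_high)   # queries from the high-resolution map
    kv = linear(low_tokens, w_low)    # keys/values from the low-resolution map
    d = len(q[0])
    fused = []
    for q_i in q:
        scores = [sum(a * b for a, b in zip(q_i, k_j)) / math.sqrt(d) for k_j in kv]
        weights = softmax(scores)
        gathered = [sum(w * k_j[c] for w, k_j in zip(weights, kv)) for c in range(d)]
        # Residual sum keeps the high-resolution content (an assumption).
        fused.append([a + b for a, b in zip(q_i, gathered)])
    return fused


rng = random.Random(0)


def rand_matrix(rows, cols):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


high = rand_matrix(16, 8)    # flattened 4x4 map with 8 channels (toy data)
low = rand_matrix(4, 16)     # flattened 2x2 map with 16 channels (toy data)
w_high = rand_matrix(8, 6)   # hypothetical projections into a shared 6-dim space
w_low = rand_matrix(16, 6)

fused = cross_level_attention(high, low, w_high, w_low)
print(len(fused), len(fused[0]))  # 16 6
```

The output keeps the high-resolution token count (16) while every token can draw on all four low-resolution positions, so no low-resolution map is ever upsampled by interpolation in this sketch.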