ZoomNet: Deep Aggregation Learning for High-Performance Small Pedestrian Detection
Proceedings of The 10th Asian Conference on Machine Learning, PMLR 95:486-501, 2018.
It remains very challenging for a single deep model to detect pedestrians of different sizes appears in an image. One typical remedy for the small pedestrian detection is to up-sample the input and pass it to the network multiple times. Unfortunately this strategy not only exponentially increases the computational cost but also probably impairs the model effectiveness. In this work, we present a deep architecture, refereed to as ZoomNet, which performs small pedestrian detection by deep aggregation learning without up-sampling the input. ZoomNet learns and aggregates deep feature representations at multiple levels and retains the spatial information of the pedestrian from different scales. ZoomNet also learns to cultivate the feature representations from the classification task to the detection task and obtains further performance improvements. Extensive experimental results demonstrate the state-of-the-art performance of ZoomNet. The source code of this work will be made public available to facilitate further studies on this problem.