ZoomNet: Deep Aggregation Learning for High-Performance Small Pedestrian Detection

Chong Shang; Haizhou Ai; Zijie Zhuang; Long Chen; Junliang Xing

Proceedings of The 10th Asian Conference on Machine Learning, PMLR 95:486-501, 2018.

Abstract

It remains very challenging for a single deep model to detect pedestrians of different sizes appearing in an image. One typical remedy for small pedestrian detection is to up-sample the input and pass it to the network multiple times. Unfortunately, this strategy not only exponentially increases the computational cost but also probably impairs the model effectiveness. In this work, we present a deep architecture, referred to as ZoomNet, which performs small pedestrian detection by deep aggregation learning without up-sampling the input. ZoomNet learns and aggregates deep feature representations at multiple levels and retains the spatial information of the pedestrian from different scales. ZoomNet also learns to cultivate the feature representations from the classification task to the detection task and obtains further performance improvements. Extensive experimental results demonstrate the state-of-the-art performance of ZoomNet. The source code of this work will be made publicly available to facilitate further studies on this problem.

Cite this Paper

BibTeX


@InProceedings{pmlr-v95-shang18a,
  title = 	 {ZoomNet: Deep Aggregation Learning for High-Performance Small Pedestrian Detection},
  author =       {Shang, Chong and Ai, Haizhou and Zhuang, Zijie and Chen, Long and Xing, Junliang},
  booktitle = 	 {Proceedings of The 10th Asian Conference on Machine Learning},
  pages = 	 {486--501},
  year = 	 {2018},
  editor = 	 {Zhu, Jun and Takeuchi, Ichiro},
  volume = 	 {95},
  series = 	 {Proceedings of Machine Learning Research},
  month = 	 {14--16 Nov},
  publisher =    {PMLR},
  pdf = 	 {http://proceedings.mlr.press/v95/shang18a/shang18a.pdf},
  url = 	 {https://proceedings.mlr.press/v95/shang18a.html},
  abstract = 	 {It remains very challenging for a single deep model to detect pedestrians of different sizes appearing in an image. One typical remedy for small pedestrian detection is to up-sample the input and pass it to the network multiple times. Unfortunately, this strategy not only exponentially increases the computational cost but also probably impairs the model effectiveness. In this work, we present a deep architecture, referred to as ZoomNet, which performs small pedestrian detection by deep aggregation learning without up-sampling the input. ZoomNet learns and aggregates deep feature representations at multiple levels and retains the spatial information of the pedestrian from different scales. ZoomNet also learns to cultivate the feature representations from the classification task to the detection task and obtains further performance improvements. Extensive experimental results demonstrate the state-of-the-art performance of ZoomNet. The source code of this work will be made publicly available to facilitate further studies on this problem.}
}

Endnote

%0 Conference Paper
%T ZoomNet: Deep Aggregation Learning for High-Performance Small Pedestrian Detection
%A Chong Shang
%A Haizhou Ai
%A Zijie Zhuang
%A Long Chen
%A Junliang Xing
%B Proceedings of The 10th Asian Conference on Machine Learning
%C Proceedings of Machine Learning Research
%D 2018
%E Jun Zhu
%E Ichiro Takeuchi
%F pmlr-v95-shang18a
%I PMLR
%P 486--501
%U https://proceedings.mlr.press/v95/shang18a.html
%V 95
%X It remains very challenging for a single deep model to detect pedestrians of different sizes appearing in an image. One typical remedy for small pedestrian detection is to up-sample the input and pass it to the network multiple times. Unfortunately, this strategy not only exponentially increases the computational cost but also probably impairs the model effectiveness. In this work, we present a deep architecture, referred to as ZoomNet, which performs small pedestrian detection by deep aggregation learning without up-sampling the input. ZoomNet learns and aggregates deep feature representations at multiple levels and retains the spatial information of the pedestrian from different scales. ZoomNet also learns to cultivate the feature representations from the classification task to the detection task and obtains further performance improvements. Extensive experimental results demonstrate the state-of-the-art performance of ZoomNet. The source code of this work will be made publicly available to facilitate further studies on this problem.

APA


Shang, C., Ai, H., Zhuang, Z., Chen, L. & Xing, J. (2018). ZoomNet: Deep Aggregation Learning for High-Performance Small Pedestrian Detection. Proceedings of The 10th Asian Conference on Machine Learning, in Proceedings of Machine Learning Research 95:486-501. Available from https://proceedings.mlr.press/v95/shang18a.html.
