ECCV 2016
Is Faster R-CNN Doing Well for Pedestrian Detection?
Liliang Zhang, Liang Lin*, Xiaodan Liang, Kaiming He


Pedestrian detection has arguably been treated as a special topic beyond general object detection. Although recent deep learning object detectors such as Fast/Faster R-CNN [1,2] have shown excellent performance for general object detection, they have had limited success in detecting pedestrians, and previous leading pedestrian detectors were generally hybrid methods combining hand-crafted and deep convolutional features. In this paper, we investigate issues involving Faster R-CNN [2] for pedestrian detection. We discover that the Region Proposal Network (RPN) in Faster R-CNN indeed performs well as a stand-alone pedestrian detector, but surprisingly, the downstream classifier degrades the results. We argue that two reasons account for the unsatisfactory accuracy: (i) insufficient resolution of feature maps for handling small instances, and (ii) lack of any bootstrapping strategy for mining hard negative examples. Driven by these observations, we propose a very simple but effective baseline for pedestrian detection, using an RPN followed by boosted forests on shared, high-resolution convolutional feature maps. We comprehensively evaluate this method on several benchmarks (Caltech, INRIA, ETH, and KITTI), presenting competitive accuracy and good speed. Code will be made publicly available.







Fig.1: Comparisons on the Caltech set (legends indicate MR).


Fig.2: Comparisons on the Caltech set using IoU > 0.7 to determine True Positives (legends indicate MR).
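The IoU > 0.7 criterion in the caption above is stricter than the 0.5 overlap threshold commonly used on Caltech, so it rewards tighter localization. The following is a minimal sketch of how such a check could be computed, not the authors' evaluation code. The function names `iou` and `is_true_positive` and the `(x1, y1, x2, y2)` box format are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two axis-aligned boxes.

    Boxes are (x1, y1, x2, y2) with x2 > x1 and y2 > y1
    (an assumed format, chosen for this illustration).
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_true_positive(detection, ground_truth, threshold=0.7):
    """A detection matches a ground-truth box when IoU exceeds the threshold."""
    return iou(detection, ground_truth) > threshold


# Hypothetical example: a 40x100 pedestrian box and a detection
# that covers only its upper 60 pixels.
gt = (0, 0, 40, 100)
det = (0, 0, 40, 60)
loose = is_true_positive(det, gt, threshold=0.5)   # True
strict = is_true_positive(det, gt, threshold=0.7)  # False
```

In this example the detection counts as a match at the 0.5 threshold but not at 0.7, which is the kind of poorly localized box the stricter criterion penalizes.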


Fig.3: Comparisons on the Caltech-New set (legends indicate MR−2 (MR−4)).
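The MR figures in these legends are log-average miss rates: the miss rate is sampled at false-positives-per-image (FPPI) values evenly spaced in log space and then averaged geometrically. MR−2 covers the FPPI range [10^-2, 1] and MR−4 the range [10^-4, 1]. The sketch below illustrates that calculation under stated assumptions and is not the benchmark's official evaluation code. The function name `log_average_miss_rate`, the default of nine sample points, and the choice to count unreachable FPPI points as a miss rate of 1.0 are all assumptions for illustration.

```python
import numpy as np


def log_average_miss_rate(fppi, miss_rate, low_exp=-2, num_points=9):
    """Geometric mean of the miss rate sampled over FPPI in [10**low_exp, 1].

    fppi and miss_rate describe one detector's curve, with fppi sorted in
    ascending order. num_points=9 is an assumed default for this sketch.
    """
    fppi = np.asarray(fppi, dtype=float)
    miss_rate = np.asarray(miss_rate, dtype=float)

    # Reference FPPI values, evenly spaced in log space.
    refs = np.logspace(low_exp, 0, num_points)

    sampled = []
    for ref in refs:
        # Last curve point whose FPPI does not exceed the reference value.
        idx = np.where(fppi <= ref)[0]
        # Assumption: if the curve never gets this low, count it as all misses.
        sampled.append(miss_rate[idx[-1]] if idx.size else 1.0)

    # Clamp away from zero so the logarithm is defined, then average in log space.
    sampled = np.maximum(np.array(sampled), 1e-10)
    return float(np.exp(np.mean(np.log(sampled))))


# Hypothetical curve: miss rate falls as more false positives are tolerated.
curve_fppi = [1e-5, 1e-3, 0.05, 0.5]
curve_mr = [0.9, 0.5, 0.2, 0.1]
mr_2 = log_average_miss_rate(curve_fppi, curve_mr, low_exp=-2)
mr_4 = log_average_miss_rate(curve_fppi, curve_mr, low_exp=-4)
```

Because MR−4 also samples the very low FPPI region, where miss rates are higher, it is the larger of the two values for a curve like this one.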


Fig.4: Comparisons on the INRIA dataset (legends indicate MR).


Fig.5: Comparisons on the ETH dataset (legends indicate MR).


Table 1: Comparisons on the KITTI dataset collected at the time of submission (Feb 2016). The timing records are collected from the KITTI leaderboard. †: region proposal running time ignored (estimated 2s).






  1.  Ross Girshick. Fast R-CNN. In IEEE International Conference on Computer Vision (ICCV), 2015. 
  2.  Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun. Faster R-CNN: Towards real-time object detection with region proposal networks. In Neural Information Processing Systems (NIPS), 2015.