TY - JOUR
T1 - R-SSD
T2 - refined single shot multibox detector for pedestrian detection
AU - Yan, Chaoqi
AU - Zhang, Hong
AU - Li, Xuliang
AU - Yuan, Ding
N1 - Publisher Copyright:
© 2022, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2022/7
Y1 - 2022/7
AB - Pedestrian detection is a critical task in the field of computer vision, and it has made considerable progress with the help of Convnets. However, a persistent crucial problem is that small-scale pedestrians are notoriously difficult to detect because of the introduction of weak contrast and blurred boundaries in real-world scenarios. In this paper, we present a simple and compact detection method for detecting multi-scale pedestrians, which is especially suitable for detecting small-scale pedestrians that are not easily recognized in images or videos. We first interpret convolutional neural network (CNN) channel features, explore the detection performance of different feature fusion methods, and propose a novel two-level feature fusion strategy specially designed for small-scale pedestrians. Moreover, a sub-network named “prediction module” is injected into the framework to improve the general performance without any bells and whistles. In addition, we propose an adaptive loss that adds an adaptive adjustment coefficient to the Smooth L1 loss function to enhance its robustness to pedestrian detection tasks. Using these methods synthetically, we achieve state-of-the-art detection performance on the Caltech pedestrian dataset under three evaluation protocols; particularly, the performance of small-scale pedestrians under “Far” evaluation setting is improved (miss rate decreases from 70.97% to 60.09%). Further, the proposed method achieves a competitive speed-accuracy trade-off with 0.31 second per image of 1024×2048 pixels on the CityPersons dataset.
KW - Computer vision
KW - Convnets
KW - Feature fusion
KW - Loss function
KW - Pedestrian detection
UR - https://www.scopus.com/pages/publications/85123182930
U2 - 10.1007/s10489-021-02798-1
DO - 10.1007/s10489-021-02798-1
M3 - Article
AN - SCOPUS:85123182930
SN - 0924-669X
VL - 52
SP - 10430
EP - 10447
JO - Applied Intelligence
JF - Applied Intelligence
IS - 9
ER -