TY - JOUR
T1 - Multiple human upper bodies detection via candidate-region convolutional neural network
AU - Zhu, Aichun
AU - Wang, Tian
AU - Qiao, Tong
N1 - Publisher Copyright: © 2018, Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2019/6/30
Y1 - 2019/6/30
N2 - Upper body detection in images is a challenging task in practical application scenarios and shares all the difficulties of object detection. This paper focuses on the problems of detecting multiple upper bodies, including the diversity of appearances, the various object scales, and the frequent occlusions. To address these problems, we divide upper body detection into two stages to form a Candidate-Region Convolutional Neural Network (CR-CNN). In the upper body candidate generation stage, a deep hierarchical model is proposed. This model is built on a graphical model that contains an appearance model and a deformable model. The appearance model is built on the feature maps of a CNN, and the deformable model is defined for each pair of connected parts to compute the relative spatial information in the graphical model. In the upper body candidate refining stage, the detected bounding boxes serve as the candidate regions and are refined in the CR-CNN. Moreover, multiple convolutional features are introduced into the CR-CNN to provide local and contextual information. The proposed method is compared with the state of the art on the TV Human Interaction (TVHI) dataset and the HollywoodHeads dataset. The experimental results demonstrate the effectiveness of the proposed method.
AB - Upper body detection in images is a challenging task in practical application scenarios and shares all the difficulties of object detection. This paper focuses on the problems of detecting multiple upper bodies, including the diversity of appearances, the various object scales, and the frequent occlusions. To address these problems, we divide upper body detection into two stages to form a Candidate-Region Convolutional Neural Network (CR-CNN). In the upper body candidate generation stage, a deep hierarchical model is proposed. This model is built on a graphical model that contains an appearance model and a deformable model. The appearance model is built on the feature maps of a CNN, and the deformable model is defined for each pair of connected parts to compute the relative spatial information in the graphical model. In the upper body candidate refining stage, the detected bounding boxes serve as the candidate regions and are refined in the CR-CNN. Moreover, multiple convolutional features are introduced into the CR-CNN to provide local and contextual information. The proposed method is compared with the state of the art on the TV Human Interaction (TVHI) dataset and the HollywoodHeads dataset. The experimental results demonstrate the effectiveness of the proposed method.
KW - Candidate regions
KW - Convolutional neural network
KW - Upper body detection
UR - https://www.scopus.com/pages/publications/85058481821
U2 - 10.1007/s11042-018-6964-7
DO - 10.1007/s11042-018-6964-7
M3 - Article
AN - SCOPUS:85058481821
SN - 1380-7501
VL - 78
SP - 16077
EP - 16096
JO - Multimedia Tools and Applications
JF - Multimedia Tools and Applications
IS - 12
ER -