TY - GEN
T1 - Pedestrian attribute recognition via hierarchical multi-task learning and relationship attention
AU - Gao, Lian
AU - Guo, Yuanfang
AU - Huang, Di
AU - Wang, Yunhong
N1 - Publisher Copyright:
© 2019 Association for Computing Machinery.
PY - 2019/10/15
Y1 - 2019/10/15
N2 - Pedestrian Attribute Recognition (PAR) is an important task in surveillance video analysis. In this paper, we propose a novel end-to-end hierarchical deep learning approach to PAR. The proposed network introduces semantic segmentation into PAR and formulates it as a multi-task learning problem, which brings in pixel-level supervision in feature learning for attribute localization. According to the spatial properties of local and global attributes, we present a two stage learning mechanism to decouple coarse attribute localization and fine attribute recognition into successive phases within a single model, which strengthens feature learning. Besides, we design an attribute relationship attention module to efficiently capture and emphasize the latent relations among different attributes, further enhancing the discriminative power of the feature. Extensive experiments are conducted and very competitive results are reached on the RAP and PETA databases, indicating the effectiveness and superiority of the proposed approach.
AB - Pedestrian Attribute Recognition (PAR) is an important task in surveillance video analysis. In this paper, we propose a novel end-to-end hierarchical deep learning approach to PAR. The proposed network introduces semantic segmentation into PAR and formulates it as a multi-task learning problem, which brings in pixel-level supervision in feature learning for attribute localization. According to the spatial properties of local and global attributes, we present a two stage learning mechanism to decouple coarse attribute localization and fine attribute recognition into successive phases within a single model, which strengthens feature learning. Besides, we design an attribute relationship attention module to efficiently capture and emphasize the latent relations among different attributes, further enhancing the discriminative power of the feature. Extensive experiments are conducted and very competitive results are reached on the RAP and PETA databases, indicating the effectiveness and superiority of the proposed approach.
KW - Deep learning
KW - Multi-task learning
KW - Pedestrian attribute recognition
KW - Visual attention
UR - https://www.scopus.com/pages/publications/85074838906
U2 - 10.1145/3343031.3351003
DO - 10.1145/3343031.3351003
M3 - 会议稿件
AN - SCOPUS:85074838906
T3 - MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia
SP - 1340
EP - 1348
BT - MM 2019 - Proceedings of the 27th ACM International Conference on Multimedia
PB - Association for Computing Machinery, Inc
T2 - 27th ACM International Conference on Multimedia, MM 2019
Y2 - 21 October 2019 through 25 October 2019
ER -