TY - GEN
T1 - Online Multi-Object Tracking with Pose-Guided Object Location and Dual Self-Attention Network
AU - Zhang, Xin
AU - Wang, Shihao
AU - Yang, Yuanzhe
AU - Chu, Chengxiang
AU - Zhou, Zhong
N1 - Publisher Copyright: © 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
AB - The recent trend in Multi-Object Tracking (MOT) is towards using deep learning to detect objects and extract features. Although tracking frameworks that use a detection network have achieved outstanding performance in object localization for MOT, crowded scenes with occlusion remain challenging. In this paper, we propose to alleviate this difficulty by combining bounding boxes from the outputs of both object detection and pose estimation. The motivation behind generating redundant candidates is that object detection and pose estimation can complement each other in tracking scenes. To select the optimal tracking objects from the candidates, we present Soft-Pose-NMS. For similarity calculation, we design a Dual Self-Attention Network (DSAN) based on the self-attention mechanism. The network generates a self-attention map that enables it to focus on the object area of detection and tracklet images. Simultaneously, the network can extract a temporal self-attention feature map to suppress noisy images in the tracklet. Experiments are conducted on the MOT benchmark datasets. Results show that our tracker achieves competitive results and is state-of-the-art in half of the metrics.
KW - Dual self-attention network
KW - Multi-object tracking
KW - Person re-identification
UR - https://www.scopus.com/pages/publications/85119265840
U2 - 10.1007/978-3-030-89370-5_17
DO - 10.1007/978-3-030-89370-5_17
M3 - Conference contribution
AN - SCOPUS:85119265840
SN - 9783030893699
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 223
EP - 235
BT - PRICAI 2021
A2 - Pham, Duc Nghia
A2 - Theeramunkong, Thanaruk
A2 - Governatori, Guido
A2 - Liu, Fenrong
PB - Springer Science and Business Media Deutschland GmbH
T2 - 18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021
Y2 - 8 November 2021 through 12 November 2021
ER -