FusionTrack: Multiple Object Tracking with Enhanced Information Utilization

  • Yifan Yang
  • Ziqi He
  • Jiaxu Wan
  • Ding Yuan
  • Hanyang Liu
  • Xuliang Li
  • Hong Zhang*

* Corresponding author for this work

Research output: Contribution to journal / Article / peer-review

Abstract

Multi-object tracking (MOT) is an important research direction in computer vision. Although existing methods handle simple tasks such as pedestrian tracking well, complex downstream tasks in which objects share a uniform appearance and move in diverse ways remain difficult. Inspired by DETR, tracking-by-attention (TBA) methods use transformers to perform multi-object tracking. However, existing TBA methods still have issues: gradient conflict in shared parameters makes it difficult to both detect and track objects, and features are underused when distinguishing similar objects. We introduce FusionTrack to address these issues. It uses a joint track-detection decoder and a score-guided multi-level query fuser to make better use of information within and between frames. With these improvements, FusionTrack achieves an 11.1% higher HOTA score on the DanceTrack dataset than the baseline model MOTR.

Original language: English
Article number: 8010
Journal: Applied Sciences (Switzerland)
Volume: 13
Issue number: 14
State: Published - Jul 2023

Keywords

  • computer vision
  • multiple-object tracking
  • object detection
  • transformer
