TransSTC: transformer tracker meets efficient spatial-temporal cues

Research output: Contribution to journal › Article › peer-review

Abstract

Recently, researchers have begun building trackers on the powerful global modeling capabilities of transformer networks. However, existing transformer trackers usually model all spatial cues in the template indiscriminately and ignore temporal cues about changes in the target's state. As a result, the tracker's attention is distracted and it gradually loses its grasp of the target's latest state. We therefore propose a new tracker, TransSTC, which explores the effective spatial cues in the template and the temporal cues available during tracking to improve performance. Specifically, we design a target-aware focused coding network that emphasizes the efficient spatial cues in the templates, reducing the impact that spatial cues weakly associated with the target have on localization accuracy. Additionally, we employ a multi-temporal template update structure that accurately captures variations in the target's appearance. Within this structure, the collected samples are assessed for target appearance similarity and environmental interference and then pass through a three-level sample selection process to ensure accurate template updates. Finally, we introduce a motion constraint framework that dynamically adjusts the classification results based on the target's historical motion trajectory. Extensive experiments on seven tracking benchmarks demonstrate that TransSTC achieves competitive tracking performance.
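The abstract does not specify how the motion constraint adjusts the classification results, so the following is only an illustrative sketch of the general idea, not the TransSTC method. It assumes a constant-velocity prediction from the last two target centers and a Gaussian re-weighting of the score map around that prediction; the function names (`predict_center`, `apply_motion_constraint`, `argmax_2d`) and the `sigma` parameter are hypothetical.

```python
import math


def predict_center(trajectory):
    """Extrapolate the next center from the last two (assumed constant velocity)."""
    if len(trajectory) < 2:
        return trajectory[-1]
    (x0, y0), (x1, y1) = trajectory[-2], trajectory[-1]
    return (2 * x1 - x0, 2 * y1 - y0)


def apply_motion_constraint(scores, trajectory, sigma=2.0):
    """Re-weight a 2D classification score map with a Gaussian motion prior.

    scores: list of rows, indexed as scores[y][x].
    trajectory: past target centers as (x, y) in score-map coordinates.
    sigma: hypothetical spread of the prior, in score-map cells.
    """
    px, py = predict_center(trajectory)
    adjusted = []
    for y, row in enumerate(scores):
        new_row = []
        for x, s in enumerate(row):
            d2 = (x - px) ** 2 + (y - py) ** 2
            new_row.append(s * math.exp(-d2 / (2 * sigma ** 2)))
        adjusted.append(new_row)
    return adjusted


def argmax_2d(scores):
    """Return the (x, y) location of the highest score."""
    best = max((s, x, y) for y, row in enumerate(scores) for x, s in enumerate(row))
    return (best[1], best[2])


# A distractor at (0, 0) scores slightly higher than the true target at (3, 2).
scores = [[0.0] * 5 for _ in range(5)]
scores[0][0] = 0.9
scores[2][3] = 0.8
history = [(1, 2), (2, 2)]  # target has been moving right along row 2

raw_peak = argmax_2d(scores)
constrained_peak = argmax_2d(apply_motion_constraint(scores, history))
```

In this toy case the raw peak falls on the distractor, while the re-weighted map peaks at the location consistent with the past trajectory. A real tracker would need to choose the prior and its strength per sequence, which this sketch leaves open.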

Original language: English
Article number: 112303
Journal: Pattern Recognition
Volume: 172
State: Published - Apr 2026

Keywords

  • Motion constraint
  • Object tracking
  • Spatial-temporal cues
  • Transformer tracker
