Abstract
Recently, researchers have begun building trackers on the powerful global modeling capabilities of transformer networks. However, existing transformer trackers usually model all template spatial cues indiscriminately and ignore the temporal cues of target state changes. This distracts the tracker's attention, and the tracker gradually loses its grasp of the target's latest state. We therefore propose a new tracker, TransSTC, which exploits the effective spatial cues in the template and the temporal cues that arise during tracking to improve performance. Specifically, we design a target-aware focused coding network that emphasizes the efficient spatial cues in the template, reducing the impact that spatial cues weakly associated with the target have on localization accuracy. Additionally, we employ a multi-temporal template update structure that accurately captures variations in the target's appearance. Within this structure, collected samples are assessed for target appearance similarity and environmental interference, and then pass through a three-level sample selection process to ensure accurate template updates. Finally, we introduce a motion constraint framework that dynamically adjusts the classification results based on the target's historical motion trajectory. Extensive experiments on seven tracking benchmarks demonstrate that TransSTC achieves competitive tracking performance.
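The abstract does not say how the motion constraint is computed. As a rough illustration of the general idea only, adjusting classification scores using the target's past trajectory, the sketch below assumes a constant-velocity prediction and a Gaussian prior over the score map. The function names, the `sigma` and `weight` parameters, and the blending rule are all hypothetical and are not taken from TransSTC.

```python
import numpy as np


def predict_center(history):
    """Extrapolate the next (row, col) center, assuming constant velocity."""
    history = np.asarray(history, dtype=float)
    if len(history) < 2:
        return history[-1]
    return history[-1] + (history[-1] - history[-2])


def motion_prior(shape, center, sigma):
    """Gaussian weight map that peaks at the predicted center."""
    rows, cols = np.indices(shape)
    dist_sq = (rows - center[0]) ** 2 + (cols - center[1]) ** 2
    return np.exp(-dist_sq / (2.0 * sigma ** 2))


def apply_motion_constraint(scores, history, sigma=2.0, weight=0.5):
    """Down-weight scores far from the predicted position (hypothetical rule)."""
    prior = motion_prior(scores.shape, predict_center(history), sigma)
    return scores * ((1.0 - weight) + weight * prior)


# Toy score map: a distractor at (1, 1) scores slightly above the target at (5, 5).
scores = np.zeros((8, 8))
scores[1, 1] = 0.90
scores[5, 5] = 0.85

# Past centers (3, 3) -> (4, 4) imply a predicted center of (5, 5).
adjusted = apply_motion_constraint(scores, history=[(3, 3), (4, 4)])
best = tuple(int(i) for i in np.unravel_index(np.argmax(adjusted), adjusted.shape))
print(best)
```

In this toy case the raw score map would pick the distractor, while the adjusted map favors the location consistent with the recent motion. A real tracker would need to handle scale changes, abrupt motion, and the mapping between score-map cells and image coordinates, none of which this sketch attempts.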
| Original language | English |
|---|---|
| Article number | 112303 |
| Journal | Pattern Recognition |
| Volume | 172 |
| DOIs | |
| State | Published - Apr 2026 |
Keywords
- Motion constraint
- Object tracking
- Spatial-temporal cues
- Transformer tracker
Fingerprint
Dive into the research topics of 'TransSTC: transformer tracker meets efficient spatial-temporal cues'. Together they form a unique fingerprint.