Visual tracking based on depth cross-correlation and feature alignment

  • Guang Han*
  • Yao Xiao
  • Fuxiang Wang
  • Xuhui Liu

*Corresponding author for this work

Research output: Contribution to journal, Article, peer-review

Abstract

Visual tracking based on Siamese networks has achieved excellent performance on many tracking datasets. However, these trackers do not deliver satisfactory results in unconstrained environments, for example under fast motion or extensive scale variation. To address these challenges, this paper proposes three modules: an Adaptive Dilated Fusion module, a Depth Pixel-Wise Correlation module, and a Feature Alignment module. The Adaptive Dilated Fusion module handles extensive scale variation by adding a receptive field pyramid on the last layer of the Siamese network; the Depth Pixel-Wise Correlation module extracts pixel-level features through average pooling and maximum pooling and reduces the influence of background noise; the Feature Alignment module alleviates the mismatch between the classification task and the regression task. Experiments are performed on several public datasets, including VOT2017, OTB100, and LaSOT. Tracking performance is tested on complex scenes involving fast motion, varying resolution, and extensive scale variation. On the OTB100 dataset, the proposed tracker (named SiamAPA) improves AUC over the reference network by 2.4% in fast-motion scenes, 4.9% in varying-resolution scenes, and 1.3% in scenes with extensive scale variation. On the VOT2017 dataset, SiamAPA improves EAO by 3.7% over the reference network. On the LaSOT dataset, accuracy improves by 1% and robustness by 1.9% compared with the reference network. Thanks to the combination of these three innovations, the proposed algorithm outperforms classical algorithms such as the SPM tracker on many datasets while still running in real time.
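
The abstract does not spell out how the Depth Pixel-Wise Correlation module is computed, so the following is only a minimal sketch of the general idea of pixel-wise correlation followed by average and maximum pooling, not the authors' implementation. It assumes that each template pixel's channel vector acts as a 1x1 kernel over the search features, which yields one response map per template pixel, and that each map is then summarized by global average and global max pooling. The function names `pixelwise_correlation` and `pooled_descriptors`, the nested-list tensor layout, and the toy inputs are all hypothetical.

```python
def pixelwise_correlation(template, search):
    """Correlate every template pixel with every search pixel.

    template: nested lists shaped C x Hz x Wz (assumed layout).
    search:   nested lists shaped C x Hx x Wx (assumed layout).
    Returns (Hz*Wz) response maps, each Hx x Wx.
    """
    channels = len(template)
    hz, wz = len(template[0]), len(template[0][0])
    hx, wx = len(search[0]), len(search[0][0])
    response = []
    for i in range(hz):
        for j in range(wz):
            # Channel vector of one template pixel, used as a 1x1 kernel.
            kernel = [template[c][i][j] for c in range(channels)]
            response.append([
                [sum(kernel[c] * search[c][y][x] for c in range(channels))
                 for x in range(wx)]
                for y in range(hx)
            ])
    return response


def pooled_descriptors(maps):
    """Global average and global max pooling of each response map."""
    avg = [sum(sum(row) for row in m) / (len(m) * len(m[0])) for m in maps]
    mx = [max(max(row) for row in m) for m in maps]
    return avg, mx


# Toy inputs (hypothetical): 2 channels, 1x2 template, 2x2 search region.
template = [[[1.0, 0.0]], [[0.0, 1.0]]]
search = [[[1.0, 2.0], [3.0, 4.0]], [[5.0, 6.0], [7.0, 8.0]]]

maps = pixelwise_correlation(template, search)
avg, mx = pooled_descriptors(maps)
```

In this toy case the two template pixels select search channel 0 and channel 1 respectively, so the response maps are simply those two channels. How the paper combines the pooled statistics, for instance as attention weights that suppress background responses, is not stated in the abstract and is left out here.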

Original language: English
Pages (from-to): 37-47
Number of pages: 11
Journal: Journal of Signal Processing Systems
Volume: 95
Issue number: 1
DOIs
State: Published - Jan 2023
Externally published: Yes

Keywords

  • Adaptive Dilated Fusion
  • Depth pixel-wise correlation
  • Feature alignment
  • Object tracking
  • Siamese network
