
Cross-Modal Attention Guided Enhanced Fusion Network for RGB-T Tracking

  • Jun Liu
  • Wei Ke*
  • Shuai Wang
  • Da Yang
  • Hao Sheng*
  • *Corresponding author of this work

Research output: Contribution to journal › Article › peer-review

Abstract

Visual tracking that combines RGB and thermal infrared modalities (RGB-T) aims to exploit the useful information in each modality to achieve more robust object localization. Most existing tracking methods based on convolutional neural networks (CNNs) and Transformers emphasize integrating multi-modal features through cross-modal attention, but overlook the potential of the complementary information learned by cross-modal attention to enhance the individual modal features. In this paper, we propose a novel hierarchical progressive fusion network based on cross-modal attention guided enhancement for RGB-T tracking. Specifically, the complementary information generated by cross-modal attention implicitly reflects the regions of interest that the modalities agree are important, and is used to enhance modal features in a targeted manner. In addition, a modal feature refinement module and a fusion module are designed based on dynamic routing to perform noise suppression and adaptive integration on the enhanced multi-modal features. Extensive experiments on GTOT, RGBT234, LasHeR and VTUAV show that our method achieves competitive performance compared with recent state-of-the-art methods.
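The core idea described in the abstract (cross-modal attention whose output is fed back to enhance each modality's features, rather than only being fused) can be illustrated with a minimal numpy sketch. This is not the authors' implementation; the function name, token/feature shapes, and the simple residual enhancement are assumptions for illustration only.

```python
# Illustrative sketch (NOT the paper's code): one modality queries the other
# via scaled dot-product attention, and the attended complementary signal is
# added back as a residual to enhance that modality's features.
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_enhance(query_feats, other_feats):
    """query_feats, other_feats: (tokens, dim) flattened feature maps.

    Returns query_feats enhanced by complementary information attended
    from the other modality.
    """
    d_k = query_feats.shape[-1]
    # Attention weights: queries from one modality, keys from the other.
    attn = softmax(query_feats @ other_feats.T / np.sqrt(d_k))  # (tokens, tokens)
    complementary = attn @ other_feats   # attended cross-modal information
    return query_feats + complementary   # residual enhancement

rng = np.random.default_rng(0)
rgb = rng.random((16, 64))       # e.g. 16 tokens of 64-dim RGB features
thermal = rng.random((16, 64))   # matching thermal-infrared features
rgb_enhanced = cross_modal_enhance(rgb, thermal)
thermal_enhanced = cross_modal_enhance(thermal, rgb)
print(rgb_enhanced.shape)  # (16, 64)
```

In the paper this enhancement is applied hierarchically and followed by dynamic-routing refinement and fusion modules; the sketch above shows only the attention-guided enhancement step.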

Original language: English
Pages (from-to): 276-280
Number of pages: 5
Journal: IEEE Signal Processing Letters
Volume: 33
DOI
Publication status: Published - Nov 2025

