
CAF-ViT: A cross-attention based Transformer network for underwater acoustic target recognition

  • Wenfeng Dong
  • Jin Fu*
  • Nan Zou
  • Chunpeng Zhao
  • Yixin Miao
  • Zheng Shen
  • *Corresponding author for this work
  • Harbin Engineering University

Research output: Contribution to journal › Article › peer-review

Abstract

Recognizing underwater acoustic targets is challenging due to the complex characteristics of acoustic sources and channels. The limited perspective of information further complicates this task. This study addresses the issue of insufficient accuracy by proposing a novel fusion recognition algorithm, CAF-ViT, which maps multiple time-frequency representation features to category outputs. We propose an improved Vision Transformer, 1D-ViT, to enhance self-attention feature extraction from LOFAR, Mel spectrum, and wavelet packet features, increasing category prediction accuracy by 5.1%, 4.7%, and 6.2%, respectively. Additionally, a two-stage fusion framework is presented, comprising a feature fusion module that fuses feature pairs using a cross-attention mechanism and a decision fusion module that determines the final category prediction by confidence weighting. Experimental results demonstrate that our method outperforms the comparison algorithms, achieving the best recognition accuracy of 83.7% on the “DeepShip” dataset. Ablation experiments further analyze the contribution of each module to the overall improvement of the proposed method.
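
The two fusion stages named in the abstract can be illustrated with a minimal sketch. The abstract does not give the layer definitions, so everything below is an assumption made for illustration: single-head scaled dot-product cross-attention without learned projections, and a decision rule that uses each branch's maximum class probability as its confidence. The function names `cross_attention` and `confidence_weighted_decision` and the toy inputs are hypothetical and are not taken from the paper.

```python
import math


def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    # Each query token (from one representation) attends over the tokens
    # of the other representation. Assumption: no learned Q/K/V
    # projections and a single head, to keep the sketch short.
    d = len(queries[0])
    fused = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys_values
        ]
        weights = softmax(scores)
        fused.append(
            [sum(w * v[j] for w, v in zip(weights, keys_values)) for j in range(d)]
        )
    return fused


def confidence_weighted_decision(prob_vectors):
    # Assumption: a branch's confidence is its largest class probability.
    # Branch outputs are averaged with these confidences as weights.
    confidences = [max(p) for p in prob_vectors]
    total = sum(confidences)
    n_classes = len(prob_vectors[0])
    combined = [
        sum(c * p[k] for c, p in zip(confidences, prob_vectors)) / total
        for k in range(n_classes)
    ]
    return combined.index(max(combined)), combined


# Toy token sequences standing in for two time-frequency feature branches.
lofar_tokens = [[1.0, 0.0], [0.0, 1.0]]
mel_tokens = [[0.5, 0.5], [1.0, 0.0], [0.0, 2.0]]
fused_tokens = cross_attention(lofar_tokens, mel_tokens)

# Toy class probabilities from three fused branches (three classes).
branch_probs = [[0.7, 0.2, 0.1], [0.4, 0.5, 0.1], [0.6, 0.3, 0.1]]
label, combined = confidence_weighted_decision(branch_probs)
```

In this toy run the fused sequence keeps the shape of the query branch (two tokens of dimension two), and the combined probabilities still sum to one because each branch's output is a probability vector and the weights are normalized.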

Original language: English
Article number: 120049
Journal: Ocean Engineering
Volume: 318
DOIs
State: Published - 15 Feb 2025

Keywords

  • Cross-attention
  • Feature fusion
  • Self-attention
  • Underwater acoustic target recognition
  • Vision transformer
