RSBEV-Mamba: 3-D BEV Sequence Modeling for Multiview Remote Sensing Scene Segmentation

  • Beihang University
  • Shanghai Artificial Intelligence Laboratory

Research output: Contribution to journal › Article › peer-review

Abstract

Multiview collaborative perception has been demonstrated to be highly effective in extracting 3-D information from remote sensing scenes by remote sensing bird’s-eye-view (RSBEV). However, inherent depth uncertainty in purely visual methods limits view fusion accuracy, and high computational complexity makes it challenging to model long sequences efficiently. To address these issues, we reformulate the BEV segmentation problem as a 3-D sequence modeling task and propose RSBEV-Mamba, a novel framework comprising a 3-D BEV module, a 3-D VMamba module, and a dense BEV contrastive learning module. The 3-D BEV module projects multiview 2-D image features into 3-D world coordinates, thus establishing a foundation for accurate spatial representation. The 3-D VMamba module, based on state-space models (SSMs), processes the densely projected features with linear computational complexity in global 3-D spatial modeling. It incorporates a 3-D selective scanning strategy (SS3D) block with 16 scanning strategies, transforming previously ignored projections at different heights into valid 3-D sequences and enriching the contextual depth and precision of BEV encoding. By employing a contrastive learning strategy with the CLIP model, we align BEV and ground truth (GT) features within the same dimensional framework, ensuring spatial integrity after side-view projection. Our approach achieves a 4% improvement in mIoU, reaching a score of 0.7368 on LEVIR-MDS and surpassing previous state-of-the-art methods. This establishes the 3-D VMamba module as a general model for 3-D perception tasks and sets a new benchmark in remote sensing technology.
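For readers unfamiliar with the metric quoted in the abstract, mean intersection-over-union (mIoU) averages the per-class overlap between predicted and ground-truth labels. The following is a minimal sketch of the standard definition, not the authors' evaluation code: the function name `mean_iou`, the flat label lists, and the choice to skip classes absent from both inputs are assumptions made for illustration, and the exact protocol used on LEVIR-MDS is not stated in this record.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union over classes present in pred or target.

    pred and target are equal-length flat sequences of integer class labels.
    Classes absent from both are skipped so they do not distort the mean
    (an assumption of this sketch; other conventions exist).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    ious = []
    for c in range(num_classes):
        # Cells where both prediction and ground truth equal class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Cells where either prediction or ground truth equals class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: six BEV cells, three classes (hypothetical labels).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
```

In the toy example the per-class IoUs are 1/3, 2/3, and 1/2, so `score` is 0.5. A reported mIoU of 0.7368 is read on the same 0 to 1 scale.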

Original language: English
Article number: 5613213
Journal: IEEE Transactions on Geoscience and Remote Sensing
Volume: 63
DOI
Publication status: Published - 2025
