TY - JOUR
T1 - Accurate semantic segmentation of very high-resolution remote sensing images considering feature state sequences
T2 - From benchmark datasets to urban applications
AU - Wang, Zijie
AU - Yi, Jizheng
AU - Chen, Aibin
AU - Chen, Lijiang
AU - Lin, Hui
AU - Xu, Kai
N1 - Publisher Copyright:
© 2025 International Society for Photogrammetry and Remote Sensing, Inc. (ISPRS)
PY - 2025/2
Y1 - 2025/2
N2 - Segmentation of very high-resolution (VHR) urban remote sensing images is widely used in ecological environmental protection, urban dynamic monitoring, fine urban management and other related fields. However, the large-scale variation and discrete distribution of objects in VHR images present a significant challenge to accurate segmentation. Existing studies have primarily concentrated on the internal correlations within a single feature, while overlooking the inherent sequential relationships across different feature states. In this paper, a novel Urban Spatial Segmentation Framework (UrbanSSF) is proposed, which fully considers the connections between feature states at different phases. Specifically, the Feature State Interaction (FSI) Mamba, which has powerful sequence modeling capabilities, is designed based on state space modules. It facilitates interactions between the information across different features. Given the disparate semantic information and spatial details of features at different scales, a Global Semantic Enhancer (GSE) module and a Spatial Interactive Attention (SIA) mechanism are designed. The GSE module operates on the high-level features, while the SIA mechanism processes the middle- and low-level features. To address the computational challenges of large-scale dense feature fusion, a Channel Space Reconstruction (CSR) algorithm is proposed. This algorithm reduces the computational burden while maintaining accuracy. In addition, the lightweight UrbanSSF-T, the efficient UrbanSSF-S and the accurate UrbanSSF-L are designed to meet different application requirements in urban scenarios. Comprehensive experiments on the UAVid, ISPRS Vaihingen and Potsdam datasets validate the superior performance of the UrbanSSF series. In particular, UrbanSSF-L achieves a mean intersection over union of 71.0% on the UAVid dataset. Code is available at https://github.com/KotlinWang/UrbanSSF.
AB - Segmentation of very high-resolution (VHR) urban remote sensing images is widely used in ecological environmental protection, urban dynamic monitoring, fine urban management and other related fields. However, the large-scale variation and discrete distribution of objects in VHR images present a significant challenge to accurate segmentation. Existing studies have primarily concentrated on the internal correlations within a single feature, while overlooking the inherent sequential relationships across different feature states. In this paper, a novel Urban Spatial Segmentation Framework (UrbanSSF) is proposed, which fully considers the connections between feature states at different phases. Specifically, the Feature State Interaction (FSI) Mamba, which has powerful sequence modeling capabilities, is designed based on state space modules. It facilitates interactions between the information across different features. Given the disparate semantic information and spatial details of features at different scales, a Global Semantic Enhancer (GSE) module and a Spatial Interactive Attention (SIA) mechanism are designed. The GSE module operates on the high-level features, while the SIA mechanism processes the middle- and low-level features. To address the computational challenges of large-scale dense feature fusion, a Channel Space Reconstruction (CSR) algorithm is proposed. This algorithm reduces the computational burden while maintaining accuracy. In addition, the lightweight UrbanSSF-T, the efficient UrbanSSF-S and the accurate UrbanSSF-L are designed to meet different application requirements in urban scenarios. Comprehensive experiments on the UAVid, ISPRS Vaihingen and Potsdam datasets validate the superior performance of the UrbanSSF series. In particular, UrbanSSF-L achieves a mean intersection over union of 71.0% on the UAVid dataset. Code is available at https://github.com/KotlinWang/UrbanSSF.
KW - Remote sensing
KW - Semantic segmentation
KW - State space modules
KW - Urban scene
KW - Very high resolution (VHR)
UR - https://www.scopus.com/pages/publications/85216555116
U2 - 10.1016/j.isprsjprs.2025.01.017
DO - 10.1016/j.isprsjprs.2025.01.017
M3 - Article
AN - SCOPUS:85216555116
SN - 0924-2716
VL - 220
SP - 824
EP - 840
JO - ISPRS Journal of Photogrammetry and Remote Sensing
JF - ISPRS Journal of Photogrammetry and Remote Sensing
ER -