TY - JOUR
T1 - Multibranch Spatial-Channel Attention for Semantic Labeling of Very High-Resolution Remote Sensing Images
AU - Han, Bingnan
AU - Yin, Jihao
AU - Luo, Xiaoyan
AU - Jia, Xiuping
N1 - Publisher Copyright: © 2004-2012 IEEE.
PY - 2021/12/1
Y1 - 2021/12/1
N2 - Very high-resolution (VHR) remote sensing images can provide fine but sometimes trivial ground object details; thus, the semantic labeling of VHR images is a challenging task. To improve the VHR labeling performance, spatial multiscale information and channel attention have been employed recently. However, the exploitation of global object features is still limited, which leads to the loss of capturing within-class variation from location to location. In this letter, we present a multibranch spatial-channel attention (MSCA) model to efficiently extract global dependency and combine it with multiscale and channel attention methods. In the spatial multiscale attention block, a multibranch feature fusion model is established to exploit the global relationship captured by self-attention and the multiscale correlation learned from dilated convolutions. To alleviate the computational cost of pixel-by-pixel self-attention operation, a spatial pyramid compressing method is also designed. In the channel attention block, average and max global pooling strategies are applied, respectively, in two channel attention branches to generalize global information from different perspectives. Those two blocks are then adaptively united by learnable weighting parameters. Experiments on two VHR image data sets demonstrate that the proposed network can yield better performance in comparison with state-of-the-art labeling methods tested.
KW - Attention mechanism
KW - convolutional neural networks (CNNs)
KW - remote sensing (RS)
KW - semantic labeling
KW - very high-resolution (VHR) images
UR - https://www.scopus.com/pages/publications/85120400214
U2 - 10.1109/LGRS.2020.3013253
DO - 10.1109/LGRS.2020.3013253
M3 - Article
AN - SCOPUS:85120400214
SN - 1545-598X
VL - 18
SP - 2167
EP - 2171
JO - IEEE Geoscience and Remote Sensing Letters
JF - IEEE Geoscience and Remote Sensing Letters
IS - 12
ER -