Abstract
Semantic segmentation of high-resolution aerial images is challenging because of complex scene variations and large differences in object scale, two issues that general semantic segmentation methods address inadequately. In this article, we propose a multiscale prototype contrast network (MPCNet) to improve adaptability to different scenes and scales. Specifically, a novel multiscale prototype transformer decoder (MPTD) extracts dynamic scene-specific prototypes that serve as pixel classifiers by fusing information from feature maps and learnable class tokens. To exploit cross-scene context and accommodate the large scale differences in aerial images, we build a multiscale prototype memory queue that stores these multiscale prototypes during training. Based on this memory queue, a novel multiscale prototype contrastive loss increases object feature discriminability across multiple scales, yielding more consistent intermediate features and faster network convergence. Extensive experiments on three publicly available datasets demonstrate the effectiveness and efficiency of MPCNet over other state-of-the-art methods. The code is available at https://github.com/qixiong-wang/mmsegmentation-mpcnet.
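The two core ideas in the abstract, classifying each pixel by its nearest class prototype and contrasting a prototype against a memory queue of stored prototypes, can be illustrated with a minimal NumPy sketch. The function names and the exact formulation below are illustrative assumptions, not the paper's actual implementation (see the linked repository for that).

```python
import numpy as np

def classify_pixels(features, prototypes):
    """Assign each pixel to the class of its most similar prototype.

    features:   (N, D) pixel embeddings
    prototypes: (C, D) one prototype per class (acting as pixel classifiers)
    """
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    p = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    # Cosine similarity to every prototype; nearest prototype wins.
    return (f @ p.T).argmax(axis=1)

def prototype_contrastive_loss(anchor, queue, queue_labels, anchor_label, tau=0.1):
    """InfoNCE-style loss pulling a prototype toward same-class prototypes
    in the memory queue and pushing it away from other classes.

    anchor:       (D,) prototype from the current batch/scale
    queue:        (M, D) prototypes stored in the memory queue
    queue_labels: (M,) class label of each queued prototype
    """
    a = anchor / np.linalg.norm(anchor)
    q = queue / np.linalg.norm(queue, axis=1, keepdims=True)
    sims = np.exp(q @ a / tau)                       # temperature-scaled similarities
    pos = sims[queue_labels == anchor_label].sum()   # same-class (positive) prototypes
    return -np.log(pos / sims.sum())
```

In the paper's setting, the queue would hold prototypes collected at multiple scales across training iterations, so the positives span scales and scenes; this sketch keeps a single flat queue for clarity.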
| Original language | English |
|---|---|
| Article number | 5615114 |
| Journal | IEEE Transactions on Geoscience and Remote Sensing |
| Volume | 61 |
| DOI | |
| Publication status | Published - 2023 |
Fingerprint
Dive into the research topics of 'Multiscale Prototype Contrast Network for High-Resolution Aerial Imagery Semantic Segmentation'. Together they form a unique fingerprint.