TY - JOUR
T1 - Efficient Semantic Splatting for Remote Sensing Multiview Segmentation
AU - Qi, Zipeng
AU - Chen, Hao
AU - Zhang, Haotian
AU - Zou, Zhengxia
AU - Shi, Zhenwei
N1 - Publisher Copyright:
© 1980-2012 IEEE.
PY - 2025
Y1 - 2025
N2 - Remote sensing multiview image segmentation is essential for achieving accurate and consistent stereoscopic perception of target scenes. This task involves processing RGB images from multiple viewpoints to generate high-accuracy, view-consistent semantic segmentation across all views. Traditional training-based methods struggle with maintaining cross-view consistency, while optimization-driven approaches using implicit neural networks improve view consistency but suffer from slow parameter optimization and inference. To overcome these limitations, we propose a novel Gaussian splatting-based semantic segmentation framework. Our method efficiently projects the color attributes and semantic features of 3-D Gaussians onto the image plane, enabling the simultaneous generation of both RGB images and segmentation outputs. By leveraging explicit spatial structures and a splatting rendering strategy, our approach significantly enhances optimization efficiency and rendering speed. In addition, we incorporate SAM2 to generate pseudo-labels for boundary regions, addressing the lack of supervision in sparsely labeled views (e.g., 3%). To further enforce cross-view consistency and feature coherence of 3-D Gaussians, we introduce a two-level aggregation loss that operates at both the 2-D feature map and 3-D spatial levels. Extensive experiments across nine datasets demonstrate the superiority of our method, achieving competitive segmentation quality with limited supervisory views. Notably, our approach reduces rendering (inference) times by 90%, while improving the average mean intersection over union (mIoU) by up to 3.5%.
KW - Gaussian splatting
KW - SAM2
KW - remote sensing
KW - semantic segmentation
UR - https://www.scopus.com/pages/publications/105002664682
U2 - 10.1109/TGRS.2025.3558217
DO - 10.1109/TGRS.2025.3558217
M3 - Article
AN - SCOPUS:105002664682
SN - 0196-2892
VL - 63
JO - IEEE Transactions on Geoscience and Remote Sensing
JF - IEEE Transactions on Geoscience and Remote Sensing
M1 - 0b00006493cb74c0
ER -