TY - GEN
T1 - FidelityBEV
T2 - 2nd International Conference on Intelligent Perception and Pattern Recognition, IPPR 2025
AU - Li, Shiong
AU - Chen, Peng
AU - Zhang, Wei
AU - Zhang, Junjie
AU - Xing, Yunzhe
AU - Lu, Xiao
N1 - Publisher Copyright: © 2025 IEEE.
PY - 2025
Y1 - 2025
N2 - Monocular Bird's-Eye View (BEV) semantic segmentation is a critical task in autonomous driving perception[1], with its performance bottleneck residing in the view transformation pipeline. This paper systematically demonstrates that prevailing methods suffer from a cascading fidelity loss, where information is distorted at the semantic, geometric, and structural levels during the transformation process. To address this issue, we propose FidelityBEV, a novel network designed for end-to-end fidelity preservation. This network rectifies the information flow through three synergistic modules: (1) a Semantic-Structural Synergy Module (S3 Module) to enhance the semantic fidelity of source information; (2) an Uncertainty Gating Unit (UGU) to preserve geometric fidelity under uncertainty; and (3) a Vertical Context Aggregator (VCA) to ensure structural fidelity during the projection process. On the KITTI-360 benchmark, FidelityBEV achieves 41.66% in mean Intersection over Union (mIoU), marking a substantial improvement of 6.43 percentage points over the baseline.
KW - Bird's-Eye View Perception
KW - Feature Fusion
KW - Information Fidelity
KW - Monocular 3D Perception
KW - Semantic Segmentation
KW - Structure-Aware Representation
KW - Uncertainty Modeling
UR - https://www.scopus.com/pages/publications/105022474980
U2 - 10.1109/IPPR66507.2025.11198421
DO - 10.1109/IPPR66507.2025.11198421
M3 - Conference contribution
AN - SCOPUS:105022474980
T3 - 2025 2nd International Conference on Intelligent Perception and Pattern Recognition, IPPR 2025
SP - 44
EP - 50
BT - 2025 2nd International Conference on Intelligent Perception and Pattern Recognition, IPPR 2025
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 15 August 2025 through 17 August 2025
ER -