TY - GEN
T1 - FusionPainting: Multimodal Fusion with Adaptive Attention for 3D Object Detection
T2 - 2021 IEEE International Intelligent Transportation Systems Conference, ITSC 2021
AU - Xu, Shaoqing
AU - Zhou, Dingfu
AU - Fang, Jin
AU - Yin, Junbo
AU - Zhou, Bin
AU - Zhang, Liangjun
N1 - Publisher Copyright:
© 2021 IEEE.
PY - 2021/9/19
Y1 - 2021/9/19
N2 - Accurate detection of obstacles in 3D is an essential task for autonomous driving and intelligent transportation. In this work, we propose a general multimodal fusion framework, FusionPainting, to fuse 2D RGB images and 3D point clouds at the semantic level to boost the 3D object detection task. Specifically, the FusionPainting framework consists of three main modules: a multi-modal semantic segmentation module, an adaptive attention-based semantic fusion module, and a 3D object detector. First, semantic information is obtained for the 2D images and the 3D LiDAR point clouds based on 2D and 3D segmentation approaches. Then the segmentation results from the different sensors are adaptively fused by the proposed attention-based semantic fusion module. Finally, the point clouds painted with the fused semantic labels are sent to the 3D detector to obtain the 3D detection results. The effectiveness of the proposed framework has been verified on the large-scale nuScenes detection benchmark by comparison with three different baselines. The experimental results show that the fusion strategy significantly improves detection performance compared with methods using only point clouds and methods using point clouds painted with only 2D segmentation information. Furthermore, the proposed approach outperforms other state-of-the-art methods on the nuScenes testing benchmark. Code will be available at https://github.com/Shaoqing26/FusionPainting/.
AB - Accurate detection of obstacles in 3D is an essential task for autonomous driving and intelligent transportation. In this work, we propose a general multimodal fusion framework, FusionPainting, to fuse 2D RGB images and 3D point clouds at the semantic level to boost the 3D object detection task. Specifically, the FusionPainting framework consists of three main modules: a multi-modal semantic segmentation module, an adaptive attention-based semantic fusion module, and a 3D object detector. First, semantic information is obtained for the 2D images and the 3D LiDAR point clouds based on 2D and 3D segmentation approaches. Then the segmentation results from the different sensors are adaptively fused by the proposed attention-based semantic fusion module. Finally, the point clouds painted with the fused semantic labels are sent to the 3D detector to obtain the 3D detection results. The effectiveness of the proposed framework has been verified on the large-scale nuScenes detection benchmark by comparison with three different baselines. The experimental results show that the fusion strategy significantly improves detection performance compared with methods using only point clouds and methods using point clouds painted with only 2D segmentation information. Furthermore, the proposed approach outperforms other state-of-the-art methods on the nuScenes testing benchmark. Code will be available at https://github.com/Shaoqing26/FusionPainting/.
UR - https://www.scopus.com/pages/publications/85118437373
U2 - 10.1109/ITSC48978.2021.9564951
DO - 10.1109/ITSC48978.2021.9564951
M3 - Conference contribution
AN - SCOPUS:85118437373
T3 - IEEE Conference on Intelligent Transportation Systems, Proceedings, ITSC
SP - 3047
EP - 3054
BT - 2021 IEEE International Intelligent Transportation Systems Conference, ITSC 2021
PB - Institute of Electrical and Electronics Engineers Inc.
Y2 - 19 September 2021 through 22 September 2021
ER -