TY - GEN
T1 - Monocular Dense SLAM with Consistent Deep Depth Prediction
AU - Yan, Feihu
AU - Wen, Jiawei
AU - Li, Zhaoxin
AU - Zhou, Zhong
N1 - Publisher Copyright:
© 2021, Springer Nature Switzerland AG.
PY - 2021
Y1 - 2021
N2 - Monocular simultaneous localization and mapping (SLAM), which uses a single moving camera for motion tracking and 3D scene structure reconstruction, is an essential task for many applications, such as vision-based robotic navigation and augmented reality (AR). However, most existing methods can only recover sparse or semi-dense point clouds, which are not adequate for many high-level tasks such as obstacle avoidance. Meanwhile, state-of-the-art methods use multi-view stereo to recover depth, which is sensitive to low-textured and non-Lambertian surfaces. In this work, we propose a novel dense mapping method for monocular SLAM that integrates deep depth prediction. More specifically, a classic feature-based SLAM framework is first used to track camera poses in real time. Then an unsupervised deep neural network for monocular depth prediction is introduced to estimate dense depth maps for selected keyframes. By incorporating a joint optimization method, the predicted depth maps are refined and used to generate local dense submaps. Finally, contiguous submaps are fused under the ego-motion constraint to construct a globally consistent dense map. Extensive experiments on the KITTI dataset demonstrate that the proposed method remarkably improves the completeness of dense reconstruction in near real time.
AB - Monocular simultaneous localization and mapping (SLAM), which uses a single moving camera for motion tracking and 3D scene structure reconstruction, is an essential task for many applications, such as vision-based robotic navigation and augmented reality (AR). However, most existing methods can only recover sparse or semi-dense point clouds, which are not adequate for many high-level tasks such as obstacle avoidance. Meanwhile, state-of-the-art methods use multi-view stereo to recover depth, which is sensitive to low-textured and non-Lambertian surfaces. In this work, we propose a novel dense mapping method for monocular SLAM that integrates deep depth prediction. More specifically, a classic feature-based SLAM framework is first used to track camera poses in real time. Then an unsupervised deep neural network for monocular depth prediction is introduced to estimate dense depth maps for selected keyframes. By incorporating a joint optimization method, the predicted depth maps are refined and used to generate local dense submaps. Finally, contiguous submaps are fused under the ego-motion constraint to construct a globally consistent dense map. Extensive experiments on the KITTI dataset demonstrate that the proposed method remarkably improves the completeness of dense reconstruction in near real time.
KW - Dense mapping
KW - Monocular depth prediction
KW - Visual SLAM
UR - https://www.scopus.com/pages/publications/85118324589
U2 - 10.1007/978-3-030-89029-2_9
DO - 10.1007/978-3-030-89029-2_9
M3 - Conference contribution
AN - SCOPUS:85118324589
SN - 9783030890285
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 113
EP - 124
BT - Advances in Computer Graphics - 38th Computer Graphics International Conference, CGI 2021, Proceedings
A2 - Magnenat-Thalmann, Nadia
A2 - Interrante, Victoria
A2 - Thalmann, Daniel
A2 - Papagiannakis, George
A2 - Sheng, Bin
A2 - Kim, Jinman
A2 - Gavrilova, Marina
PB - Springer Science and Business Media Deutschland GmbH
T2 - 38th Computer Graphics International Conference, CGI 2021
Y2 - 6 September 2021 through 10 September 2021
ER -