Abstract
Robust spatial perception is essential for SLAM in robotics and autonomous systems, but existing pipelines often fail in structure-deficient scenes when they rely on a single modality or decouple depth estimation from SLAM. We present a joint depth-enhanced, multi-model SLAM system tailored for such scenarios, with three core contributions. First, we propose a multi-model depth fusion framework (MDFF) that fuses visual, LiDAR, inertial, and learned depth cues. Second, we design a dense scan-to-map module (DSM) within the LiDAR–Inertial Subsystem (LIS) that eliminates handcrafted features. Third, we develop a depth-aware backend optimization (DBO) that jointly refines poses, landmarks, and scale using multi-view and single-view depth constraints. The system targets high-throughput computing, with embarrassingly parallel per-point residuals and GPU-ready depth inference. Experiments show that DSM reduces LiDAR-inertial processing time relative to LVI-SAM, while the full pipeline runs in real time (21.5 FPS LiDAR, 28.6 FPS camera) and delivers higher localization accuracy than representative baselines.
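To illustrate what "embarrassingly parallel per-point residuals" can mean in a feature-free scan-to-map step, the following is a minimal sketch and not the paper's implementation. The abstract does not state the residual form, so a point-to-plane distance (a common choice in LiDAR registration) is assumed here, and the function name `point_to_plane_residuals` and all array names are hypothetical. Each residual depends only on its own point and matched plane, so the whole batch reduces to one vectorized expression that maps naturally onto GPU threads.

```python
import numpy as np

def point_to_plane_residuals(points, plane_points, plane_normals, R, t):
    """Signed distance from each transformed scan point to its matched map plane.

    Assumed residual form (not taken from the paper):
        r_i = n_i . (R p_i + t - q_i)
    points        : (N, 3) scan points in the sensor frame
    plane_points  : (N, 3) one point q_i on each matched local map plane
    plane_normals : (N, 3) unit normal n_i of each matched plane
    R, t          : (3, 3) rotation and (3,) translation of the current pose
    """
    transformed = points @ R.T + t  # apply the pose to every point at once
    # Row-wise dot product: no residual depends on any other point.
    return np.einsum("ij,ij->i", transformed - plane_points, plane_normals)

# Toy example: the map is the plane z = 0 and the scan floats 0.5 above it.
scan = np.array([[0.0, 0.0, 0.5], [1.0, 2.0, 0.5], [-3.0, 1.0, 0.5]])
anchors = np.zeros((3, 3))
normals = np.tile([0.0, 0.0, 1.0], (3, 1))

residuals = point_to_plane_residuals(scan, anchors, normals, np.eye(3), np.zeros(3))
aligned = point_to_plane_residuals(
    scan, anchors, normals, np.eye(3), np.array([0.0, 0.0, -0.5])
)
```

In this toy case the identity pose leaves every point 0.5 off the plane, and shifting the pose by -0.5 along z drives all residuals to zero. A real pipeline would feed such residuals into a nonlinear least-squares solver together with inertial constraints.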
| Original language | English |
|---|---|
| Article number | 306 |
| Journal | Journal of Supercomputing |
| Volume | 82 |
| Issue number | 5 |
| DOIs | |
| State | Published - Apr 2026 |
Keywords
- Depth estimation
- Multi-model
- SLAM
- Structure-degraded environments
Fingerprint
Dive into the research topics of 'Joint depth estimation and multi-model SLAM for robust perception in structure-degraded environments'. Together they form a unique fingerprint.