TY - JOUR
T1 - UGV Swarm Multi-View Fusion Under Occlusion
T2 - A Graph-Based Calibration-Free Framework
AU - Jing, Jiaqi
AU - Song, Weilong
AU - Zhang, Hangcheng
AU - Liu, Yong
AU - Feng, Fuyong
AU - Zheng, Dezhi
AU - Fan, Shangchun
N1 - Publisher Copyright: © 2026 by the authors.
PY - 2026/3
Y1 - 2026/3
N2 - Highlights: What are the main findings? A calibration-free, end-to-end graph-based framework is proposed for joint camera and subject registration in UGV swarms, operating robustly under severe occlusion and low inter-view co-visibility. A graph-based pose propagation module (GPPM) is proposed that enables global alignment via BFS-guided pose propagation along local co-visibility links, eliminating the need for full co-visibility with a root node or pre-calibrated extrinsics. What are the implications of the main findings? Although purely vision-based, the method enables infrastructure-free perception in GPS-denied or complex environments by decoupling multi-view fusion from explicit agent pose estimation, thereby supporting dynamic deployment of UGV swarms with a central node for robust collaborative perception tasks. The method enables robust BEV scene representation and collective situational awareness for UGV swarms, even when inter-camera overlap is minimal and occlusions are severe. In unmanned ground vehicle (UGV) swarm systems, comprehensive environmental awareness is critical for coordinated operations. These swarms are frequently deployed in occlusion-rich, constrained environments where multi-agent visual fusion is essential. However, existing methods are limited in two ways: they rely on offline-calibrated extrinsic parameters, which hinders flexible deployment, and they assume strong co-visibility, which fails under severe occlusion. To overcome these constraints, we introduce an end-to-end, calibration-free framework for the joint registration of cameras and subjects. Our approach begins with a single-view module that estimates subjects’ poses and appearance features. Subsequently, a novel graph-based pose propagation module (GPPM) treats the UGVs’ cameras as nodes in a graph, connecting two cameras with an edge when they share co-visible subjects identified via appearance matching. Breadth-first search (BFS) then finds the shortest registration path from any camera to a designated root camera, enabling pose propagation via local co-visibility links and global alignment of all subjects into a unified bird’s-eye-view (BEV) space. This strategy relaxes the stringent requirement of full co-visibility with the root node. A multi-task loss function is proposed to jointly optimize pose estimation and feature matching. Trained and evaluated on a synthetic dataset with occlusions (CSRD-O) collected by a UGV swarm system, our framework achieves mean camera pose errors of 1.57 m/8.70° and mean subject pose errors of 1.40 m/9.14°. Furthermore, we demonstrate a scene monitoring task using a UGV swarm system. Experiments show that the proposed method generates robust BEV estimates even under severe occlusion and low inter-view overlap. This work presents a purely visual, self-calibrating multi-view fusion perception scheme and demonstrates its potential to support cooperative perception, task-oriented monitoring, and collective situational awareness in UGV swarm systems.
KW - bird’s-eye-view
KW - calibration-free
KW - multi-view fusion
KW - pose registration
KW - unmanned ground vehicle
UR - https://www.scopus.com/pages/publications/105034274321
U2 - 10.3390/drones10030214
DO - 10.3390/drones10030214
M3 - Article
AN - SCOPUS:105034274321
SN - 2504-446X
VL - 10
JO - Drones
JF - Drones
IS - 3
M1 - 214
ER -