TY - JOUR
T1 - Robust 3D Gaussian SLAM for Humanoid Robots with Visual Enhancement in IoT-enabled Dynamic Crowd Scenes
AU - Chen, Yu
AU - Wang, Tian
AU - Chen, Xuanzhen
AU - Luo, Jingwen
N1 - Publisher Copyright:
© 2026 IEEE.
PY - 2026
Y1 - 2026
N2 - In recent years, humanoid robots have gradually emerged as crucial “smart terminals” within the IoT, significantly expanding its application scenarios. This paper proposes a visual-enhancement-based robust 3D Gaussian SLAM (3DGS-SLAM) method to improve the environmental perception ability of humanoid robots. First, the dynamic masks provided by YOLOv11 are refined using composite morphological operations to obtain high-confidence static features, which are then employed for accurate estimation of the robot’s pose. Subsequently, a keyframe selection strategy combining scaled interval Kalman filtering (SIKF) with the motion model of a floating-base humanoid robot is designed. It quantifies pose uncertainty by dynamically dividing the robot’s motion interval and introduces a scaling factor to adaptively adjust the interval width, thereby selecting high-quality keyframes to ensure stable pose tracking and efficient mapping. Furthermore, building on 3D Gaussian splatting, a mapping strategy that combines a Gaussian-Laplacian pyramid with adaptive density control of Gaussian ellipsoids is developed. This method not only identifies the Gaussian ellipsoids that require splitting by preserving high-frequency details in keyframes, but also controls their splitting direction through gradient modulation, effectively reducing redundancy among Gaussian ellipsoids while enhancing mapping quality. Experimental results show that our method reduces the average absolute trajectory error (ATE) on the BONN and TUM datasets by 90.8% and 93.1%, respectively, compared with ORB-SLAM3, significantly improving the localization accuracy and mapping consistency of humanoid robots in dynamic crowded scenes.
AB - In recent years, humanoid robots have gradually emerged as crucial “smart terminals” within the IoT, significantly expanding its application scenarios. This paper proposes a visual-enhancement-based robust 3D Gaussian SLAM (3DGS-SLAM) method to improve the environmental perception ability of humanoid robots. First, the dynamic masks provided by YOLOv11 are refined using composite morphological operations to obtain high-confidence static features, which are then employed for accurate estimation of the robot’s pose. Subsequently, a keyframe selection strategy combining scaled interval Kalman filtering (SIKF) with the motion model of a floating-base humanoid robot is designed. It quantifies pose uncertainty by dynamically dividing the robot’s motion interval and introduces a scaling factor to adaptively adjust the interval width, thereby selecting high-quality keyframes to ensure stable pose tracking and efficient mapping. Furthermore, building on 3D Gaussian splatting, a mapping strategy that combines a Gaussian-Laplacian pyramid with adaptive density control of Gaussian ellipsoids is developed. This method not only identifies the Gaussian ellipsoids that require splitting by preserving high-frequency details in keyframes, but also controls their splitting direction through gradient modulation, effectively reducing redundancy among Gaussian ellipsoids while enhancing mapping quality. Experimental results show that our method reduces the average absolute trajectory error (ATE) on the BONN and TUM datasets by 90.8% and 93.1%, respectively, compared with ORB-SLAM3, significantly improving the localization accuracy and mapping consistency of humanoid robots in dynamic crowded scenes.
KW - 3D Gaussian splatting
KW - Gaussian-Laplacian pyramid
KW - dynamic crowded scene
KW - humanoid robot
KW - visual SLAM
UR - https://www.scopus.com/pages/publications/105031739917
U2 - 10.1109/JIOT.2026.3668278
DO - 10.1109/JIOT.2026.3668278
M3 - Article
AN - SCOPUS:105031739917
SN - 2327-4662
JO - IEEE Internet of Things Journal
JF - IEEE Internet of Things Journal
ER -