TY - JOUR
T1 - Dual-Path CNN-BiLSTM for mmWave-Based Human Skeletal Pose Estimation
AU - He, Yuqiang
AU - Wang, Jun
AU - Li, Yaxin
AU - Luo, Yuquan
N1 - Publisher Copyright: © 2001-2012 IEEE.
PY - 2025
Y1 - 2025
AB - In this article, we introduce a novel method for human skeletal joint localization using millimeter-wave (mmWave) radar. The method avoids two limitations of vision-based pose estimation: sensitivity to changes in lighting conditions and privacy concerns. The mmWave radar generates 4-D time-series point cloud data, which are then projected onto the depth-azimuth and depth-elevation planes. This projection helps mitigate the sparsity inherent in traditional point cloud data and reduces the complexity of the machine learning model required for pose estimation. The input data structure is optimized using a sliding window technique, in which consecutive frames are processed by a convolutional neural network (CNN) to extract spatial features. These features are then ordered chronologically and fed into a bi-directional long short-term memory (BiLSTM) network to capture temporal features, yielding the locations of 25 skeletal joints. To validate the proposed method, we created a dataset comprising three body types and ten distinct actions. The experimental results demonstrate the method's strong human pose estimation capability.
KW - Bi-directional long short-term memory (BiLSTM)
KW - convolutional neural network (CNN)
KW - human skeletal pose estimation
KW - millimeter-wave (mmWave) radar
KW - point cloud
UR - https://www.scopus.com/pages/publications/105002571389
U2 - 10.1109/JSEN.2025.3543343
DO - 10.1109/JSEN.2025.3543343
M3 - Article
AN - SCOPUS:105002571389
SN - 1530-437X
VL - 25
SP - 11683
EP - 11696
JO - IEEE Sensors Journal
JF - IEEE Sensors Journal
IS - 7
ER -