A Spatiotemporal Attention Network for mmWave-Based 3-D Human Skeleton Estimation

  • Yuquan Luo
  • Xiangqian Li
  • Yaxin Li*
  • Song Liang
  • Changshun Yuan
  • Jun Wang
  *Corresponding author for this work

Research output: Contribution to journal › Article › peer-review

Abstract

This article presents a spatiotemporal attention framework for radar-based human pose estimation (STAR-Pose), a unified model for estimating 3-D skeletal coordinates directly from millimeter-wave (mmWave) radar point clouds. The framework integrates PointNet++ for spatial feature extraction and a bidirectional long short-term memory (BiLSTM) module enhanced by attention pooling for temporal modeling. Data were collected using a self-developed BHYY_MMW6044 radar operating in the 59–64-GHz band, capturing dynamic human motion in diverse scenarios. Comprehensive experiments demonstrate that STAR-Pose achieves an average localization error of 1.76 cm, outperforming existing radar-based baselines. The framework exhibits strong robustness to noisy frames, varying motion speeds, and cross-subject conditions, while maintaining stable accuracy under occlusion and multipath interference. Overall, STAR-Pose provides a reliable and privacy-preserving approach for human pose estimation with mmWave radar, paving the way for intelligent sensing applications in smart healthcare, ambient monitoring, and human–computer interaction.
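
As a rough illustration of the attention pooling step mentioned in the abstract, the sketch below collapses a sequence of per-frame feature vectors (such as BiLSTM outputs) into a single vector using softmax weights over time. It is a minimal, generic formulation and not the authors' implementation: the function name `attention_pool`, the scoring vector `w`, and the dot-product scoring rule are all assumptions made for illustration.

```python
import math


def attention_pool(frames, w):
    """Collapse T per-frame feature vectors into one pooled vector.

    frames: list of T feature vectors, each a list of D floats
            (standing in for BiLSTM outputs; hypothetical data).
    w:      scoring vector of length D (assumed to be learned;
            here it is simply passed in).
    Returns (pooled_vector, attention_weights).
    """
    # One scalar relevance score per frame: dot product with w.
    scores = [sum(wi * hi for wi, hi in zip(w, h)) for h in frames]

    # Numerically stable softmax over the time axis.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Weighted sum of the frame features, one output value per dimension.
    dim = len(frames[0])
    pooled = [
        sum(a * h[d] for a, h in zip(weights, frames)) for d in range(dim)
    ]
    return pooled, weights


# Two toy frames with two features each (made-up values).
frames = [[1.0, 0.0], [0.0, 1.0]]
pooled, weights = attention_pool(frames, w=[0.0, 0.0])
print(weights, pooled)
```

With an all-zero scoring vector every frame gets the same weight, so the pooling reduces to a plain temporal average; a scoring vector that favors some frames shifts the pooled vector toward them, which is one plausible way a model could down-weight noisy frames.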

Original language: English
Pages (from-to): 44363-44377
Number of pages: 15
Journal: IEEE Sensors Journal
Volume: 25
Issue number: 24
DOIs
State: Published - 15 Dec 2025

UN SDGs

This output contributes to the following UN Sustainable Development Goals (SDGs)

  1. SDG 3 - Good Health and Well-being

Keywords

  • Attention mechanism
  • PointNet++
  • bidirectional long short-term memory (BiLSTM)
  • human pose estimation
  • millimeter-wave (mmWave) radar
  • point cloud
