AHRNET: ATTENTION AND HEATMAP-BASED REGRESSOR FOR HAND POSE ESTIMATION AND MESH RECOVERY

  • Feng Zhou
  • Pei Shen
  • Ju Dai*
  • Na Jiang
  • Yong Hu
  • Yu Kun Lai
  • Paul L. Rosin
  • *Corresponding author for this work
  • North China University of Technology
  • Peng Cheng Laboratory
  • Capital Normal University
  • Cardiff University

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

Abstract

Estimating 3D hand pose and recovering the full hand surface mesh from a single RGB image is challenging because of self-occlusions, viewpoint changes, and the complexity of hand articulations. In this paper, we propose a novel framework that combines an attention mechanism with heatmap regression to accurately and efficiently predict 3D joint locations and reconstruct the hand mesh. We adopt a pooling attention module that learns to focus on relevant regions in the input image to extract better features for handling occlusions, while greatly reducing the computational cost. Multi-scale 2D heatmaps provide spatial constraints to guide the 3D vertex predictions. By exploiting the complementary strengths of sparse 2D supervision and dense mesh regression, our method accurately reconstructs hand meshes with realistic details. Extensive experiments on standard benchmarks demonstrate that the proposed method improves 3D hand pose estimation and mesh recovery performance while remaining efficient. Reproducible recipes are available at https://github.com/SDiannn/AHRNET-Heatmap.
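
The abstract does not say how the 2D heatmaps are decoded into joint coordinates. As a rough illustration of heatmap-based regression in general, the sketch below renders a Gaussian heatmap for one joint and recovers its location with a soft-argmax, a common differentiable decoding step. It is not taken from the AHRNet code: the function names `gaussian_heatmap` and `soft_argmax_2d`, the 16x16 grid, and the `sigma` and `beta` values are all assumptions chosen for the example.

```python
import math


def gaussian_heatmap(height, width, cx, cy, sigma=1.5):
    """Render a 2D Gaussian peaked at (cx, cy); a typical heatmap target (assumed form)."""
    return [
        [math.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma ** 2))
         for x in range(width)]
        for y in range(height)
    ]


def soft_argmax_2d(heatmap, beta=10.0):
    """Return the softmax-weighted expected (x, y) position of a heatmap.

    beta is a hypothetical sharpening factor: larger values push the
    estimate toward the hard argmax.
    """
    peak = max(max(row) for row in heatmap)
    # Subtract the peak before exponentiating for numerical stability.
    weights = [[math.exp(beta * (v - peak)) for v in row] for row in heatmap]
    total = sum(sum(row) for row in weights)
    x = sum(w * col for row in weights for col, w in enumerate(row)) / total
    y = sum(w * r for r, row in enumerate(weights) for w in row) / total
    return x, y


# Hypothetical joint at column 5, row 9 of a 16x16 heatmap.
heatmap = gaussian_heatmap(16, 16, cx=5, cy=9)
x, y = soft_argmax_2d(heatmap)
print(f"decoded joint location: x={x:.2f}, y={y:.2f}")
```

Because the soft-argmax is a weighted average rather than a hard maximum, it stays differentiable, which is one reason this style of decoding is often paired with coordinate regression losses.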

Original language: English
Title of host publication: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Proceedings
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 3000-3004
Number of pages: 5
ISBN (Electronic): 9798350344851
DOIs
State: Published - 2024
Event: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024 - Seoul, Korea, Republic of
Duration: 14 Apr 2024 to 19 Apr 2024

Publication series

Name: ICASSP, IEEE International Conference on Acoustics, Speech and Signal Processing - Proceedings
ISSN (Print): 1520-6149

Conference

Conference: 2024 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2024
Country/Territory: Korea, Republic of
City: Seoul
Period: 14/04/24 to 19/04/24

Keywords

  • Deep Learning
  • Hand Pose
  • Heatmap
  • Human-computer Interaction
  • Mesh Recovery
