TY - GEN
T1 - Two-hand Pose Estimation from the non-cropped RGB Image with Self-Attention Based Network
AU - Sun, Zhoutao
AU - Hu, Yong
AU - Shen, Xukun
N1 - Publisher Copyright: © 2021 IEEE.
PY - 2021
Y1 - 2021
N2 - Estimating the pose of two hands is a crucial problem for many human-computer interaction applications. Because most existing works predict hand pose from cropped images, they either require a hand detection stage before pose estimation or take cropped images as input directly. In this paper, we propose the first real-time one-stage method for pose estimation from a single RGB image without hand tracking. Combining the self-attention mechanism with convolutional layers, the proposed network predicts the 2.5D hand joint coordinates while locating the two hand regions. To reduce the extra memory and computational cost caused by self-attention, we propose a linear attention structure with a spatial-reduction attention block, called the SRAN block. We demonstrate the effectiveness of each component of our network through an ablation study. Experiments on public datasets show results competitive with state-of-the-art methods.
AB - Estimating the pose of two hands is a crucial problem for many human-computer interaction applications. Because most existing works predict hand pose from cropped images, they either require a hand detection stage before pose estimation or take cropped images as input directly. In this paper, we propose the first real-time one-stage method for pose estimation from a single RGB image without hand tracking. Combining the self-attention mechanism with convolutional layers, the proposed network predicts the 2.5D hand joint coordinates while locating the two hand regions. To reduce the extra memory and computational cost caused by self-attention, we propose a linear attention structure with a spatial-reduction attention block, called the SRAN block. We demonstrate the effectiveness of each component of our network through an ablation study. Experiments on public datasets show results competitive with state-of-the-art methods.
KW - Artificial intelligence
KW - Computer vision
KW - Human computer interaction (HCI)
KW - Interaction paradigms
KW - Mixed / augmented reality
KW - Pose estimation
UR - https://www.scopus.com/pages/publications/85126393139
U2 - 10.1109/ISMAR52148.2021.00040
DO - 10.1109/ISMAR52148.2021.00040
M3 - Conference contribution
AN - SCOPUS:85126393139
T3 - Proceedings - 2021 IEEE International Symposium on Mixed and Augmented Reality, ISMAR 2021
SP - 248
EP - 255
BT - Proceedings - 2021 IEEE International Symposium on Mixed and Augmented Reality, ISMAR 2021
A2 - Marchal, Maud
A2 - Ventura, Jonathan
A2 - Olivier, Anne-Helene
A2 - Wang, Lili
A2 - Radkowski, Rafael
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 20th IEEE International Symposium on Mixed and Augmented Reality, ISMAR 2021
Y2 - 4 October 2021 through 8 October 2021
ER -