TY - GEN
T1 - Assessing Action Quality via Attentive Spatio-Temporal Convolutional Networks
AU - Wang, Jiahao
AU - Du, Zhengyin
AU - Li, Annan
AU - Wang, Yunhong
N1 - Publisher Copyright: © 2020, Springer Nature Switzerland AG.
PY - 2020
Y1 - 2020
AB - Action quality assessment, which aims at evaluating the performance of specific actions, has drawn more and more attention due to its extensive demand in sports, health care, etc. Unlike action recognition, in which a few typical frames are sufficient for classification, action quality assessment requires analysis at a fine temporal granularity to discover the subtle motion difference. In this paper, we propose a novel spatio-temporal framework for action quality assessment at full-frame-rate (25fps), which consists of two steps: i.e. spatio-temporal feature extraction and temporal feature fusion, respectively. In the first step, to generate representative spatio-temporal dynamics, we utilize a spatial convolutional network (SCN) together with specially designed temporal convolutional networks (TCNs) and train them by a two-stage strategy. In the second step, we introduce an attention mechanism to fuse features in the temporal dimension according to their impact on the overall performance. Compared with existing three dimensional convolutional neural networks (3D-CNN) based methods, our model is capable of capturing more action quality relevant details. As a by-product, our model can also attend to the highlight moments in sports videos, which gives a better interpretation of the score. Extensive experiments on three public benchmarks demonstrate that the proposed method has distinct advantage in action quality assessment and achieves improvement over the state-of-the-art.
KW - Action quality assessment
KW - Attentive fusion
KW - Temporal convolution
UR - https://www.scopus.com/pages/publications/85093927003
U2 - 10.1007/978-3-030-60639-8_1
DO - 10.1007/978-3-030-60639-8_1
M3 - Conference contribution
AN - SCOPUS:85093927003
SN - 9783030606381
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 3
EP - 16
BT - Pattern Recognition and Computer Vision - 3rd Chinese Conference, PRCV 2020, Proceedings
A2 - Peng, Yuxin
A2 - Zha, Hongbin
A2 - Liu, Qingshan
A2 - Lu, Huchuan
A2 - Sun, Zhenan
A2 - Liu, Chenglin
A2 - Chen, Xilin
A2 - Yang, Jian
PB - Springer Science and Business Media Deutschland GmbH
T2 - 3rd Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2020
Y2 - 16 October 2020 through 18 October 2020
ER -