TY - JOUR
T1 - Multi-skeleton structures graph convolutional network for action quality assessment in long videos
AU - Lei, Qing
AU - Li, Huiying
AU - Zhang, Hongbo
AU - Du, Jixiang
AU - Gao, Shangce
N1 - Publisher Copyright: © 2023, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.
PY - 2023/10
Y1 - 2023/10
N2 - In most existing action quality assessment (AQA) methods, how to score simple actions in short-term sport videos has been widely explored. Recently, a few studies have attempted to solve the AQA problem of long-duration activity by extracting dynamic or static information directly from RGB video. However, these methods may ignore specific postures defined by dynamic changes in human body joints, which makes the results inaccurate and unexplainable. In this work, we propose a novel graph convolution network based on multiple skeleton structure modelling to address the problem of effective pose feature learning to improve the performance of AQA in complex activity. Specifically, three kinds of skeleton structures, including the joints’ self-connection, the intra-part connection, and the inter-part connection, are defined to model the motion patterns of joints and body parts. Moreover, a temporal attention learning module is designed to extract temporal relations between skeleton subsequences. We evaluate the proposed method on two benchmark datasets, the MIT-skate dataset and the Rhythmic Gymnastics dataset. Extensive experiments are conducted to verify the effectiveness of the proposed method. The experimental results show that our method achieves state-of-the-art performance.
KW - Action quality assessment
KW - Graph convolutional network
KW - Long sport videos
UR - http://www.scopus.com/inward/record.url?scp=85161389494&partnerID=8YFLogxK
U2 - 10.1007/s10489-023-04613-5
DO - 10.1007/s10489-023-04613-5
M3 - Article
AN - SCOPUS:85161389494
SN - 0924-669X
VL - 53
SP - 21692
EP - 21705
JO - Applied Intelligence
JF - Applied Intelligence
IS - 19
ER -