3D Global and Multi-View Local Features Combination Based Qualitative Action Recognition for Volleyball Game Analysis

Xina CHENG  Yang LIU  Takeshi IKENAGA  

Publication
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E102-A   No.12   pp.1891-1899
Publication Date: 2019/12/01
Online ISSN: 1745-1337
DOI: 10.1587/transfun.E102.A.1891
Type of Manuscript: Special Section PAPER (Special Section on Smart Multimedia & Communication Systems)
Category: Image
Keyword: 
volleyball game analysis,  qualitative action recognition,  3D global feature,  multi-view local feature,  combination framework,  

Full Text: PDF(1.8MB)>>
Buy this Article




Summary: 
Volleyball video analysis plays important roles in providing data for TV contents and developing strategies. Among all the topics of volleyball analysis, qualitative player action recognition is essential because it potentially provides not only the action that being performed but also the quality, which means how well the action is performed. However, most action recognition researches focus on the discrimination between different actions. The quality of an action, which is helpful for evaluation and training of the player skill, has only received little attention so far. The vital problems in qualitative action recognition include occlusion, small inter-class difference and various kinds of appearance caused by the player change. This paper proposes a 3D global and multi-view local features combination based recognition framework with global team formation feature, ball state feature and abrupt pose features. The above problems are solved by the combination of 3D global features (which hide the unstable and incomplete 2D motion feature caused by occlusion) and the multi-view local features (which get detailed local motion features of body parts in multiple viewpoints). Firstly, the team formation extracts the 3D trajectories from the whole team members rather than a single target player. This proposal focuses more on the entire feature while eliminating the personal effect. Secondly, the ball motion state feature extracts features from the 3D ball trajectory. The ball motion is not affected by the personal appearance, so this proposal ignores the influence of the players appearance and makes it more robust to target player change. At last, the abrupt pose feature consists of two parts: the abrupt hit frame pose (which extracts the contour shape of the player's pose at the hit time) and abrupt pose variation (which extracts the pose variation between the preparation pose and ending pose during the action). These two features make difference of each action quality more distinguishable by focusing on the motion standard and stability between different quality actions. Experiments are conducted on game videos from the Semifinal and Final Game of 2014 Japan Inter High School Games of Men's Volleyball in Tokyo Metropolitan Gymnasium. The experimental results show the accuracy achieves 97.26%, improving 11.33% for action discrimination and 91.76%, and improving 13.72% for action quality evaluation.