Predicting Performance of Collaborative Storytelling Using Multimodal Analysis

Shogo OKADA  Mi HANG  Katsumi NITTA  

IEICE TRANSACTIONS on Information and Systems   Vol.E99-D   No.6   pp.1462-1473
Publication Date: 2016/06/01
Publicized: 2016/04/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015CBP0003
Type of Manuscript: Special Section PAPER (Special Section on Human Cognition and Behavioral Science and Technology)
Keywords: storytelling performance, multimodal interaction, inference, data mining, small group, conversation analysis

This study models the storytelling performance of participants in a group conversation. Storytelling is a fundamental communication skill for conveying information and entertainment effectively to a listener. We present a multimodal analysis of storytelling performance in a group conversation, as evaluated by external observers. We collected a new multimodal data corpus through a group storytelling task; the corpus includes the participants' performance scores. We extract multimodal (verbal and nonverbal) features of storytellers and listeners from a manual description of the spoken dialog and from various nonverbal patterns, including each participant's speaking turns, utterance prosody, head gestures, hand gestures, and head direction. We also extract multimodal co-occurrence features, such as head gestures, and interaction features, such as a storyteller's utterance overlapping with a listener's backchannel. In the experiment, we modeled the relationship between the performance indices and the multimodal features using machine-learning techniques. Experimental results show that the highest accuracy, measured by the coefficient of determination (R²), is 0.299 for the total storytelling performance (the sum of the index scores), obtained with a combination of verbal and nonverbal features in a regression task.
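
To make two of the quantities mentioned in the summary concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that R² is the standard coefficient of determination, 1 - SS_res / SS_tot, and that an interaction feature of the "utterance overlapped with backchannel" kind can be approximated as the total overlap time between two sets of time intervals. The function names `r_squared` and `overlap_seconds`, and the representation of intervals as (start, end) pairs in seconds, are hypothetical choices made for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    y_true: observer-assigned scores; y_pred: scores predicted by a model.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all true scores are equal")
    return 1.0 - ss_res / ss_tot


def overlap_seconds(utterances, backchannels):
    """Total time during which an utterance overlaps a backchannel.

    Both arguments are lists of (start, end) pairs in seconds.
    Assumption: intervals within each list do not overlap one another,
    so no stretch of time is counted twice.
    """
    total = 0.0
    for u_start, u_end in utterances:
        for b_start, b_end in backchannels:
            # Length of the intersection of the two intervals, or 0 if disjoint.
            total += max(0.0, min(u_end, b_end) - max(u_start, b_start))
    return total
```

Under this definition, a model that always predicts the mean score has R² = 0 and a perfect model has R² = 1, so the reported value of 0.299 means the features account for roughly 30% of the variance in the total performance score.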