Development of Acoustic Nonverbal Information Estimation System for Unconstrained Long-Term Monitoring of Daily Office Activity

Hitomi YOKOYAMA  Masano NAKAYAMA  Hiroaki MURATA  Kinya FUJITA  
[Paper on system development]

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E102-D   No.2   pp.331-345
Publication Date: 2019/02/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2018EDK0005
Type of Manuscript: PAPER
Category: Human-computer Interaction
Keyword: 
acoustic nonverbal information system, speaker estimation, long-term monitoring, daily office conversations

Summary: 
Aimed at long-term monitoring of daily office conversations without recording the conversational content, a system is presented for estimating acoustic nonverbal information such as utterance duration, utterance frequency, and turn-taking. The system combines a sound localization technique based on the sound energy distribution with 16 beam-forming microphone-array modules mounted in the ceiling to reduce the influence of multiple sound reflections. Furthermore, human detection using a wide-field-of-view camera is integrated into the system for more robust speaker estimation. The system estimates the speaker of each utterance and calculates nonverbal information from that estimate. An evaluation of data collected over ten 12-hour workdays in an office with three assigned workers showed that the system achieved 72% speech segmentation detection accuracy and 86% speaker identification accuracy when utterances were correctly detected. Even with false voice detection and incorrect speaker identification, and even in cases where the participants frequently made noise or where seven participants had gathered for a discussion, the ranking of the participants by the amount of calculated acoustic nonverbal information coincided with the ranking based on human-coded acoustic nonverbal information. Continuous analysis of communication dynamics, such as dominance and conversation participation roles, through nonverbal information will reveal the dynamics of a group. The main contribution of this study is to demonstrate the feasibility of unconstrained long-term monitoring of daily office activity through acoustic nonverbal information.
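
As an illustration of the last processing step described in the summary, the following minimal sketch shows how utterance duration, utterance frequency, and turn-taking might be tallied once each utterance has been assigned a speaker. It is not the authors' implementation. The `Utterance` record, the function names, and the sample values are hypothetical, and the sketch assumes that a turn is counted whenever two consecutive utterances come from different speakers, a definition the summary does not specify.

```python
from collections import Counter, defaultdict
from typing import NamedTuple


class Utterance(NamedTuple):
    """Hypothetical record: one detected utterance with its estimated speaker."""
    speaker: str
    start: float  # seconds from the start of the recording
    end: float    # seconds from the start of the recording


def summarize(utterances):
    """Tally per-speaker duration, utterance count, and speaker-to-speaker turns."""
    ordered = sorted(utterances, key=lambda u: u.start)
    duration = defaultdict(float)  # total speaking time per speaker
    count = Counter()              # number of utterances per speaker
    turns = Counter()              # (previous speaker, next speaker) transitions
    prev = None
    for u in ordered:
        duration[u.speaker] += u.end - u.start
        count[u.speaker] += 1
        # Assumption: a turn is a change of speaker between consecutive utterances.
        if prev is not None and prev.speaker != u.speaker:
            turns[(prev.speaker, u.speaker)] += 1
        prev = u
    return dict(duration), dict(count), dict(turns)


def rank_by_duration(duration):
    """Order speakers from the largest to the smallest total speaking time."""
    return sorted(duration, key=duration.get, reverse=True)


if __name__ == "__main__":
    # Made-up sample values for illustration only.
    sample = [
        Utterance("A", 0.0, 4.0),
        Utterance("B", 5.0, 7.0),
        Utterance("A", 8.0, 9.0),
        Utterance("C", 10.0, 10.5),
    ]
    duration, count, turns = summarize(sample)
    print(duration, count, turns)
    print(rank_by_duration(duration))
```

A ranking of this kind is what the evaluation compares between the system output and the human-coded data. Only speaker labels and timestamps enter the calculation, so no conversational content is needed.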