Comparison of Output Devices for Augmented Audio Reality

Kazuhiro KONDO  Naoya ANAZAWA  Yosuke KOBAYASHI  

IEICE TRANSACTIONS on Information and Systems   Vol.E97-D    No.8    pp.2114-2123
Publication Date: 2014/08/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E97.D.2114
Type of Manuscript: PAPER
Category: Speech and Hearing
augmented audio reality, speech intelligibility, mobile audio navigation, bone-conduction headphone, binaural microphone/earphone


We compared two audio output devices for augmented audio reality applications. In these applications, we plan to overlay speech annotations on the actual ambient environment. It is therefore essential that the output device deliver intelligible speech annotations while passing the environmental auditory scene through transparently. The first candidate was the bone-conduction headphone, which delivers speech signals by vibrating the skull; because it leaves the ear canals open, normal hearing of surrounding sound remains intact. The second was the binaural microphone/earphone combo, which has a form factor similar to a regular earphone but integrates a small microphone at the ear canal entrance. The input from these microphones can be fed back to the earphones along with the annotation speech. For reference, we also compared both devices to normal hearing (i.e., without headphones or earphones). We measured speech intelligibility while competing babble noise was presented simultaneously from the surrounding environment. The binaural combo generally delivered speech at comparable or higher intelligibility than the bone-conduction headphones. With the binaural combo, however, we found that closing off the ear canals with the earphones significantly altered the ear canal transfer characteristics. When we applied a compensation filter to account for this transfer function deviation, the resulting speech intelligibility was significantly higher. Nevertheless, both devices were found to be acceptable as audio output devices for augmented audio reality applications, since both can deliver speech at high intelligibility even in the presence of a significant amount of competing noise.
In fact, both of these speech output methods were able to deliver speech signals at higher intelligibility than natural speech, especially when the SNR was low.
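
For readers unfamiliar with the SNR conditions mentioned above, the following is a minimal sketch of how a speech signal and a competing noise signal can be set to a target SNR. It only illustrates the definition SNR = 20 log10(rms_speech / rms_noise); it is not the procedure used in the paper, where speech was delivered through the devices and babble noise was presented from the surrounding environment. The function names (rms, mix_at_snr, measured_snr_db) and the synthetic stand-in signals are assumptions made for illustration.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that the speech-to-noise RMS ratio equals `snr_db`,
    then add it to `speech`. Returns (mixture, scaled_noise)."""
    if len(speech) != len(noise):
        raise ValueError("speech and noise must have the same length")
    noise_rms = rms(noise)
    if noise_rms == 0.0:
        raise ValueError("noise is silent; SNR is undefined")
    # From SNR_dB = 20 * log10(rms_speech / rms_noise):
    target_noise_rms = rms(speech) / (10.0 ** (snr_db / 20.0))
    gain = target_noise_rms / noise_rms
    scaled_noise = [gain * n for n in noise]
    mixture = [s + n for s, n in zip(speech, scaled_noise)]
    return mixture, scaled_noise


def measured_snr_db(speech, noise):
    """SNR in dB computed from the RMS levels of the two signals."""
    return 20.0 * math.log10(rms(speech) / rms(noise))


if __name__ == "__main__":
    # Hypothetical stand-ins: a 440 Hz tone for speech, Gaussian noise for babble.
    rng = random.Random(0)
    speech = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(16000)]
    babble = [rng.gauss(0.0, 1.0) for _ in range(16000)]
    for snr in (-12.0, -6.0, 0.0, 6.0):
        _, scaled = mix_at_snr(speech, babble, snr)
        print(snr, round(measured_snr_db(speech, scaled), 3))
```

A lower (more negative) SNR means the noise is scaled up relative to the speech, which is the condition under which the abstract reports the largest advantage for the two devices over natural speech.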
