Enhancing Stereo Signals with High-Order Ambisonics Spatial Information

Jorge TREVINO, Shuichi SAKAMOTO, Junfeng LI, Yôiti SUZUKI

Publication
IEICE TRANSACTIONS on Information and Systems, Vol.E99-D, No.1, pp.41-49
Publication Date: 2016/01/01
Publicized: 2015/10/21
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2015MUI0001
Type of Manuscript: INVITED PAPER (Special Section on Enriched Multimedia---Creation of a New Society through Value-added Multimedia Content---)
Keywords: spatial sound, high-order Ambisonics, spatialization, surround, sound signal encoding

Summary: 
There is a strong push towards the ultra-realistic presentation of multimedia content, made possible by the latest advances in computational and signal processing technologies. Three-dimensional sound presentation is necessary to convey a natural and rich multimedia experience. Promising ways to achieve this include the sound field reproduction technique known as high-order Ambisonics (HOA). While these advanced methods are now within the capabilities of consumer-level processing systems, their adoption is hindered by the lack of content. Production and coding of the audio components in multimedia focus on traditional formats such as stereophonic sound. Mainstream audio codecs and media such as CDs or DVDs do not support advanced, rich content such as HOA encodings. To ameliorate this problem and speed up the adoption of spatial sound technologies, this paper proposes a novel way to downmix HOA content into a stereo signal. The resulting data can be distributed using conventional methods such as audio CDs or as the audio component of an internet video stream. The downmix can be played back on legacy stereo reproduction systems, yet it carries spatial information encoded as inter-channel level and phase differences. The proposed method consists of a downmixing filterbank which independently modulates the inter-channel differences at each frequency bin. The proposal is evaluated using simple test signals and found to outperform conventional methods such as matrix-encoded surround and the Ambisonics UHJ format in terms of spatial resolution. The proposal can be coupled with a previously presented method to recover HOA signals from stereo recordings. The resulting system preserves full-surround spatial information in ultra-realistic content when it is transferred using a stereo stream. Simulation results show that a compatible decoder can accurately recover up to five HOA channels from a stereo signal (second-order HOA data in the horizontal plane).
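
The figure of five channels follows from the general rule that horizontal-only HOA of order N uses 2N + 1 circular-harmonic channels. The short Python sketch below illustrates that count by encoding a single plane wave. It is not the downmixing method of the paper: the function names, the channel ordering, and the unnormalized gains are assumptions made here for illustration only.

```python
import math


def horizontal_hoa_channel_count(order):
    """Number of channels for horizontal-only HOA of the given order (2N + 1)."""
    return 2 * order + 1


def encode_plane_wave_horizontal(azimuth_rad, order):
    """Return the encoding gains of a unit-amplitude plane wave.

    Assumed channel layout (hypothetical, chosen for this sketch):
    [1, cos(phi), sin(phi), cos(2*phi), sin(2*phi), ...], with no
    normalization applied.
    """
    gains = [1.0]  # order-0 (omnidirectional) component
    for m in range(1, order + 1):
        gains.append(math.cos(m * azimuth_rad))
        gains.append(math.sin(m * azimuth_rad))
    return gains


# A source at 30 degrees azimuth, encoded at second order.
gains = encode_plane_wave_horizontal(math.radians(30.0), order=2)
print(horizontal_hoa_channel_count(2), len(gains))  # prints "5 5"
```

Real HOA systems differ in channel ordering and normalization conventions, so the gain values above should be read only as an example of one possible layout.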