Measuring the Perceived Importance of Speech Segments for Transmission over IP Networks

Yusuke HIWASAKI  Toru MORINAGA  Jotaro IKEDO  Akitoshi KATAOKA  

IEICE TRANSACTIONS on Communications   Vol.E89-B   No.2   pp.326-333
Publication Date: 2006/02/01
Online ISSN: 1745-1345
DOI: 10.1093/ietcom/e89-b.2.326
Print ISSN: 0916-8516
Type of Manuscript: Special Section PAPER (Special Section on Multimedia QoS Evaluation and Management Technologies)
Keywords: scalable speech coding, multiple description coding, packet networks, VAD, QoS, VoIP, estimated MOS, linear regression model

This paper presents a method that uses a linear regression model to produce a single-valued criterion indicating the perceived importance of each block in a stream of speech blocks. The method is superior to the conventional approach, voice activity detection (VAD), in that it provides a dynamically changing priority value for speech segments with finer granularity. It can be used in conjunction with scalable speech coding techniques in the context of IP QoS services to achieve flexible quality control for speech transmission. A simple linear regression model estimates the mean opinion score (MOS) for various cases of missing speech segments. The estimated MOS is a continuous value that can be mapped to priority levels with arbitrary granularity. Subjective evaluation shows the validity of the calculated priority values.
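
The idea of estimating a MOS by linear regression and mapping it to priority levels can be illustrated with a minimal sketch. This is not the authors' model: the abstract does not state which block features or coefficients are used, so the single scalar `feature`, the toy training values, and the function names `fit_line`, `estimate_mos`, and `priority_level` are all assumptions made for illustration. The sketch assumes that a lower estimated MOS when a block is missing means the block is more important.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (one feature)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def estimate_mos(feature, slope, intercept):
    """Estimated MOS if the block with this (hypothetical) feature is lost,
    clamped to the usual 1..5 MOS range."""
    return min(5.0, max(1.0, slope * feature + intercept))


def priority_level(est_mos, levels, mos_min=1.0, mos_max=5.0):
    """Quantize a continuous estimated MOS into `levels` priority classes.
    Assumption: the lower the MOS when the block is missing, the higher
    the priority (levels - 1 is the highest)."""
    importance = (mos_max - est_mos) / (mos_max - mos_min)
    return min(levels - 1, int(importance * levels))


# Toy, made-up training data (NOT from the paper): a hypothetical block
# feature and the MOS observed when blocks with that feature were dropped.
toy_features = [0.0, 1.0, 2.0, 3.0]
toy_mos = [4.0, 3.5, 3.0, 2.5]

slope, intercept = fit_line(toy_features, toy_mos)
for f in toy_features:
    mos = estimate_mos(f, slope, intercept)
    print(f, mos, priority_level(mos, levels=4))
```

Because the estimated MOS is continuous, the `levels` argument can be chosen freely, which is what allows a finer granularity than the two-state speech/non-speech decision of a VAD.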