A Non-Intrusive Speech Intelligibility Estimation Method Based on Deep Learning Using Autoencoder Features
Yoonhee KIM, Deokgyu YUN, Hannah LEE, Seung Ho CHOI
IEICE TRANSACTIONS on Information and Systems
Publication Date: 2020/03/01
Online ISSN: 1745-1361
Type of Manuscript: LETTER
Category: Speech and Hearing
autoencoder, bottleneck feature, STOI, deep learning, long short-term memory (LSTM)
This paper presents a deep learning-based non-intrusive speech intelligibility estimation method that uses the bottleneck features of an autoencoder. The conventional standard non-intrusive method, P.563, estimates intelligibility poorly in various noise environments. We propose a more accurate method based on a long short-term memory (LSTM) neural network whose input is the autoencoder bottleneck features and whose output is a short-time objective intelligibility (STOI) score. STOI is a standard intrusive measure of speech intelligibility, that is, one that requires a reference speech signal. In experiments on speech signals in various noise environments, the proposed method outperformed both the conventional standard P.563 and an intelligibility estimation method based on mel-frequency cepstral coefficient (MFCC) features.
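
The data flow described in the abstract (per-frame features, autoencoder bottleneck, LSTM, one STOI-like score per utterance) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: the abstract gives no layer sizes, input features, training procedure, or loss, so the dimensions, the random untrained weights, the omission of bias terms, the sigmoid output layer, and all function names below are assumptions chosen only for illustration.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Small random weights standing in for trained parameters."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def encode(frame, W_enc):
    """Encoder half of an autoencoder: frame -> low-dimensional bottleneck."""
    return [math.tanh(z) for z in matvec(W_enc, frame)]


def lstm_step(x, h, c, W):
    """One LSTM step; W maps each gate name to a matrix over [x, h]."""
    xh = x + h  # list concatenation of input and previous hidden state
    i = [sigmoid(z) for z in matvec(W["i"], xh)]    # input gate
    f = [sigmoid(z) for z in matvec(W["f"], xh)]    # forget gate
    o = [sigmoid(z) for z in matvec(W["o"], xh)]    # output gate
    g = [math.tanh(z) for z in matvec(W["g"], xh)]  # candidate cell state
    c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
    h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
    return h_new, c_new


def estimate_score(frames, W_enc, W_lstm, w_out):
    """Run bottleneck features through the LSTM and map the final hidden
    state to a score in (0, 1), the same range as STOI."""
    hidden_dim = len(w_out)
    h = [0.0] * hidden_dim
    c = [0.0] * hidden_dim
    for frame in frames:
        h, c = lstm_step(encode(frame, W_enc), h, c, W_lstm)
    return sigmoid(sum(w * v for w, v in zip(w_out, h)))


# Hypothetical sizes; the abstract does not state the real ones.
FRAME_DIM, BOTTLENECK_DIM, HIDDEN_DIM = 16, 4, 8

rng = random.Random(0)
W_enc = rand_matrix(BOTTLENECK_DIM, FRAME_DIM, rng)
W_lstm = {gate: rand_matrix(HIDDEN_DIM, BOTTLENECK_DIM + HIDDEN_DIM, rng)
          for gate in "ifog"}
w_out = [rng.uniform(-0.1, 0.1) for _ in range(HIDDEN_DIM)]

# Twenty random vectors stand in for the feature frames of a noisy utterance.
frames = [[rng.random() for _ in range(FRAME_DIM)] for _ in range(20)]
score = estimate_score(frames, W_enc, W_lstm, w_out)
print(score)
```

With untrained weights the printed value is meaningless. In a setup like the one the abstract describes, the network would presumably be trained so that this output matches the STOI score computed with the clean reference signal, which is what allows the estimator to run without a reference at test time.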