Blind Bandwidth Extension with a Non-Linear Function and Its Evaluation on Automatic Speaker Verification

Ryota KAMINISHI  Haruna MIYAMOTO  Sayaka SHIOTA  Hitoshi KIYA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E103-D   No.1   pp.42-49
Publication Date: 2020/01/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.2019MUP0008
Type of Manuscript: Special Section PAPER (Special Section on Enriched Multimedia — Application of Multimedia Technology and Its Security —)
Category: 
Keyword: blind bandwidth extension, non-linear function, automatic speaker verification, i-vector, x-vector

Summary: 
This study evaluates the effects of several non-learning blind bandwidth extension (BWE) methods on state-of-the-art automatic speaker verification (ASV) systems. A non-linear bandwidth extension (N-BWE) method was recently proposed as a blind, non-learning, and lightweight BWE approach, and other non-learning BWE methods have also been developed in recent years. Most of the data available for training ASV systems is narrowband (NB) telephone speech, whereas wideband (WB) data have also been used to train state-of-the-art ASV systems such as those based on the i-vector, d-vector, and x-vector. Using all of these datasets together can therefore cause sampling rate mismatches. In this paper, we investigate the influence of sampling rate mismatches on x-vector-based ASV systems and how well non-learning BWE methods compensate for them. The results show that the N-BWE method improved the equal error rate (EER) of x-vector-based ASV systems when mismatches were present. We also examined the relationship between objective measurements and EERs: the N-BWE method produced the lowest EERs on both ASV systems, together with a lower root mean square log-spectral distance (RMS-LSD) and a higher short-time objective intelligibility (STOI) score.
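
For readers unfamiliar with the EER reported above, the following is a minimal sketch of how an equal error rate can be estimated from verification scores. It is not the authors' evaluation code. The function name `equal_error_rate`, the score lists, and the rule of accepting a trial when its score is at or above the threshold are all illustrative assumptions. The sketch sweeps every observed score as a candidate threshold and reports the operating point where the false acceptance and false rejection rates are closest.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping each observed score as a threshold.

    Hypothetical helper for illustration only. A trial is assumed to be
    accepted when its score is >= the threshold.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    best_gap = None
    eer = None
    for thr in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a genuine (target) trial scores below the threshold.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        # False acceptance: an impostor (non-target) trial scores at or above it.
        far = sum(s >= thr for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap = gap
            eer = (far + frr) / 2
    return eer


if __name__ == "__main__":
    # Toy scores (made up): one target and one non-target trial overlap.
    targets = [0.9, 0.8, 0.4, 0.7]
    nontargets = [0.1, 0.2, 0.5, 0.3]
    print(equal_error_rate(targets, nontargets))
```

A lower EER means the verifier separates target and non-target trials better. Evaluation toolkits typically interpolate between thresholds rather than picking the closest observed point, so values from this sketch can differ slightly from theirs on small trial lists.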