|
|
Please login using the form on menu list.
It is required to login for Full-Text PDF.
|
Speech Clarity Index (Ψ): A Distance-Based Speech Quality Indicator and Recognition Rate Prediction for Dysarthric Speakers with Cerebral Palsy
Prakasith KAYASITH
Thanaruk THEERAMUNKONG
Publication
IEICE TRANSACTIONS on Information and Systems Vol.E92-D No.3 pp.460-468
Publication Date: 2009/03/01
Online ISSN: 1745-1361
Print ISSN: 0916-8532
Type of Manuscript: PAPER
Category: Speech and Hearing
Keyword: speech disorder,
dysarthric speech recognition,
speech assessment,
speech quality index,
recognition rate prediction,
Full Text: PDF(386.3KB)
Summary: It is a tedious and subjective task to measure severity of a dysarthria by manually evaluating his/her speech using available standard assessment methods based on human perception. This paper presents an automated approach to assess speech quality of a dysarthric speaker with cerebral palsy. With the consideration of two complementary factors, speech consistency and speech distinction, a speech quality indicator called speech clarity index (Ψ) is proposed as a measure of the speaker's ability to produce consistent speech signal for a certain word and distinguished speech signal for different words. As an application, it can be used to assess speech quality and forecast speech recognition rate of speech made by an individual dysarthric speaker before actual exhaustive implementation of an automatic speech recognition system for the speaker. The effectiveness of Ψ as a speech recognition rate predictor is evaluated by rank-order inconsistency, correlation coefficient, and root-mean-square of difference. The evaluations had been done by comparing its predicted recognition rates with ones predicted by the standard methods called the articulatory and intelligibility tests based on the two recognition systems (HMM and ANN). The results show that Ψ is a promising indicator for predicting recognition rate of dysarthric speech. All experiments had been done on speech corpus composed of speech data from eight normal speakers and eight dysarthric speakers.
|
|