Estimating Syntactic Structure from Prosody in Japanese Speech

Tomoko OHSUGA  Yasuo HORIUCHI  Akira ICHIKAWA  

Publication
IEICE TRANSACTIONS on Information and Systems   Vol.E86-D   No.3   pp.558-564
Publication Date: 2003/03/01
Print ISSN: 0916-8532
Type of Manuscript: Special Section PAPER (Special Issue on Speech Information Processing)
Category: Speech Synthesis and Prosody
Keyword:
prosodic information, syntactic structure, tree structure, discriminant analysis

Summary: 
In this study, we introduce a method for estimating the syntactic structure of Japanese speech from the F0 contour and pause duration. We define a prosodic unit (PU) as a segment of speech delimited by local minima of the F0 contour or by pauses. A tree structure is built up gradually by repeatedly combining a pair of adjacent PUs into a single PU. For each sequence of three PUs, a discriminant function obtained by discriminant analysis of a speech corpus decides which pair should be combined. We applied the method to the ATR Phonetically Balanced Sentences read by four Japanese speakers. The rate of correct judgments for sequences of three PUs was 79%, and the syntactic structure of the entire sentence was estimated correctly for 26% of the sentences. We consider this a good degree of accuracy for the difficult task of estimating syntactic structure from prosody alone.
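
The bottom-up combination of PUs described in the summary can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the summary does not give the features or coefficients of the discriminant function, nor the order in which sequences of three PUs are visited. Here `Node`, `toy_discriminant`, `merge`, and `build_tree` are hypothetical names, the discriminant is a stand-in that only compares pause durations, and the left-to-right scan is one plausible choice.

```python
from dataclasses import dataclass
from typing import Callable, List, Union

Tree = Union[str, tuple]  # a PU label, or a (left, right) pair of subtrees


@dataclass
class Node:
    tree: Tree          # structure built so far for this (possibly merged) PU
    pause_after: float  # pause in seconds after the unit (hypothetical feature)


def toy_discriminant(a: Node, b: Node, c: Node) -> float:
    """Stand-in for the trained discriminant function (an assumption).

    A positive score means "combine (a, b)"; otherwise "combine (b, c)".
    This toy version prefers the pair separated by the shorter pause.
    """
    return b.pause_after - a.pause_after


def merge(left: Node, right: Node) -> Node:
    # The merged PU inherits the pause that follows its right-hand part.
    return Node((left.tree, right.tree), right.pause_after)


def build_tree(
    pus: List[Node],
    discriminant: Callable[[Node, Node, Node], float] = toy_discriminant,
) -> Tree:
    nodes = list(pus)
    if not nodes:
        raise ValueError("need at least one prosodic unit")
    while len(nodes) > 2:
        # Scan sequences of three PUs left to right (assumed order) and
        # merge the first left-hand pair the discriminant selects.
        for i in range(len(nodes) - 2):
            if discriminant(nodes[i], nodes[i + 1], nodes[i + 2]) > 0:
                break
        else:
            # Every triple preferred its right-hand pair: merge the last two.
            i = len(nodes) - 2
        nodes[i:i + 2] = [merge(nodes[i], nodes[i + 1])]
    if len(nodes) == 2:
        nodes = [merge(nodes[0], nodes[1])]
    return nodes[0].tree


if __name__ == "__main__":
    # Made-up pause durations, for illustration only.
    pus = [Node("A", 0.05), Node("B", 0.30), Node("C", 0.02), Node("D", 0.0)]
    print(build_tree(pus))
```

In a faithful reimplementation, `toy_discriminant` would be replaced by a function fitted by discriminant analysis on F0 and pause features from a labeled corpus; the merging loop itself does not depend on how that function is obtained.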