Prosodic Characteristics of Japanese Conversational Speech

Nobuyoshi KAIKI  Yoshinori SAGISAKA  

IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences   Vol.E76-A    No.11    pp.1927-1933
Publication Date: 1993/11/25
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Section on Speech Synthesis: Current Technologies and Their Application)
Keywords: prosody, conversation, fundamental frequency, amplitude, segmental duration

In this paper, we quantitatively analyzed speech data in seven different styles, with the aim of synthesizing natural Japanese conversational speech. Three reading styles were produced at different speeds (slow, normal and fast), and four speaking styles were produced by enacting conversations in different situations (free, hurried, angry and polite). To clarify the differences in prosodic characteristics between conversational speech and read speech, the means and standard deviations of vowel duration, vowel amplitude and fundamental frequency (F0) were analyzed. We found large variation in these prosodic parameters. To examine the differences in segmental duration and segmental amplitude between conversational speech and read speech more precisely, control rules for prosodic parameters in reading styles were applied to conversational speech. F0 contours of the different speaking styles were superposed after normalizing the segmental duration. The differences between the estimated values and the actual values were analyzed. Large differences were found at sentence-final positions and in key (focused) phrases. Sentence-final positions showed lengthening of vowel duration and increased vowel amplitude. Key phrase positions featured raised F0.
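
The abstract does not say how segmental duration was normalized before the F0 contours were superposed. As a minimal sketch of one plausible approach, the Python below resamples each segment of a contour to the same number of points by linear interpolation, so that contours from utterances of different lengths line up segment by segment. The function names (`interp`, `normalize_contour`), the choice of linear interpolation, and all F0 values and boundaries are assumptions made for illustration. They are not the authors' procedure or data.

```python
from bisect import bisect_right


def interp(x, xs, ys):
    """Linearly interpolate y at x from strictly increasing xs and matching ys."""
    if x <= xs[0]:
        return ys[0]
    if x >= xs[-1]:
        return ys[-1]
    i = bisect_right(xs, x)  # guarantees xs[i - 1] <= x < xs[i]
    frac = (x - xs[i - 1]) / (xs[i] - xs[i - 1])
    return ys[i - 1] + frac * (ys[i] - ys[i - 1])


def normalize_contour(times, f0, boundaries, points_per_segment=10):
    """Resample an F0 contour so every segment yields the same number of points.

    times, f0  : sample times (s) and F0 values (Hz) of one utterance
    boundaries : segment boundary times (s), one more than the number of segments
    """
    out = []
    for start, end in zip(boundaries[:-1], boundaries[1:]):
        step = (end - start) / points_per_segment
        for k in range(points_per_segment):
            out.append(interp(start + k * step, times, f0))
    return out


# Invented example: the same two-segment phrase in a slow and a fast rendition.
slow = normalize_contour([0.0, 1.0, 2.0], [120.0, 160.0, 110.0], [0.0, 1.0, 2.0], 4)
fast = normalize_contour([0.0, 0.4, 1.0], [130.0, 180.0, 115.0], [0.0, 0.4, 1.0], 4)

# After normalization both contours have the same length and can be compared
# point by point, regardless of the original segment durations.
f0_difference = [b - a for a, b in zip(slow, fast)]
```

Because each segment maps to a fixed number of points, a point-by-point difference such as `f0_difference` compares like positions in the phrase rather than like instants in time.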