Predictors of Pause Duration in Read-Aloud Discourse

Xiaohong YANG  Mingxing XU  Yufang YANG  

IEICE TRANSACTIONS on Information and Systems   Vol.E97-D   No.6   pp.1461-1467
Publication Date: 2014/06/01
Online ISSN: 1745-1361
DOI: 10.1587/transinf.E97.D.1461
Type of Manuscript: Special Section PAPER (Special Section on Advances in Modeling for Real-world Speech Information Processing and its Application)
Category: Speech Synthesis and Related Topics
Keywords: pause duration, syntactic structure, discourse hierarchy, topic structure, phrase length

This paper investigates the predictors of pause duration in read-aloud discourse. Using simple linear regression and stepwise multiple linear regression, we examined how five factors (syntactic structure, discourse hierarchy, topic structure, preboundary length, and postboundary length) influence pause duration, both separately and jointly. Simple regression showed that discourse hierarchy, syntactic structure, topic structure, and postboundary length each had a significant effect on boundary pause duration. When these factors were entered into a stepwise regression, however, only discourse hierarchy, syntactic structure, and postboundary length remained significant. The model that best predicted boundary pause duration in discourse context entered syntactic structure first, followed by discourse hierarchy and postboundary length, and accounted for about 80% of the variance in pause duration. Mediation tests showed that the effects of topic structure and discourse hierarchy were significantly mediated by syntactic structure, the factor most closely correlated with pause duration. These results support an integrated model that combines the influence of several factors, and they can be applied to text-to-speech systems.
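
As a rough illustration of the stepwise procedure described in the abstract, the sketch below runs a greedy forward selection over the five factors on synthetic data. Everything in it is an assumption made for illustration: the data are randomly generated and are not the authors' corpus, the numeric coding of each factor and all coefficients are invented, and the stopping rule (a minimum gain in R^2, the hypothetical `min_gain` parameter) stands in for the significance tests that a standard stepwise regression would use. The names `r_squared` and `forward_stepwise` are likewise hypothetical helpers.

```python
import numpy as np


def r_squared(X, y):
    """R^2 of an ordinary least-squares fit of y on the columns of X (intercept added)."""
    A = np.column_stack([np.ones(len(y)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - ss_res / ss_tot


def forward_stepwise(predictors, y, min_gain=0.01):
    """Greedy forward selection: repeatedly add the predictor with the largest
    R^2 gain, and stop when the best gain falls below min_gain.
    (Simplification: real stepwise regression uses F-tests for entry/removal.)"""
    selected, best_r2 = [], 0.0
    remaining = list(predictors)
    while remaining:
        scores = {}
        for name in remaining:
            cols = [predictors[n] for n in selected + [name]]
            scores[name] = r_squared(np.column_stack(cols), y)
        best = max(scores, key=scores.get)
        if scores[best] - best_r2 < min_gain:
            break
        selected.append(best)
        remaining.remove(best)
        best_r2 = scores[best]
    return selected, best_r2


# Synthetic data only: codings and coefficients are invented for illustration.
rng = np.random.default_rng(0)
n = 1000
syntax = rng.integers(1, 4, n).astype(float)         # boundary level coded 1..3
discourse = 0.8 * syntax + rng.normal(0.0, 0.8, n)   # correlated with syntax
topic = 0.7 * syntax + rng.normal(0.0, 0.7, n)       # correlated, no direct effect
pre_len = rng.integers(3, 15, n).astype(float)       # no effect on pause
post_len = rng.integers(3, 15, n).astype(float)

# Pause duration (arbitrary units) depends on syntax, discourse, and post_len.
pause = 200 * syntax + 80 * discourse + 15 * post_len + rng.normal(0.0, 50.0, n)

predictors = {
    "syntax": syntax,
    "discourse": discourse,
    "topic": topic,
    "pre_len": pre_len,
    "post_len": post_len,
}

selected, r2 = forward_stepwise(predictors, pause)
print("selected predictors:", selected)
print("R^2 of final model:", round(r2, 3))
```

Because `topic` is built here to correlate with `syntax` without affecting `pause` directly, it predicts pause duration on its own but adds almost nothing once `syntax` is in the model. This mirrors, in toy form, the pattern reported above, where topic structure was significant in simple regression but dropped out of the stepwise model.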