For Full-Text PDF, please login, if you are a member of IEICE,|
or go to Pay Per View on menu list, if you are a nonmember of IEICE.
Optimal Temporal Decomposition for Voice Morphing Preserving Δ Cepstrum
IEICE TRANSACTIONS on Fundamentals of Electronics, Communications and Computer Sciences
Publication Date: 2004/03/01
Print ISSN: 0916-8508
Type of Manuscript: Special Section PAPER (Special Section on Applications and Implementations of Digital Signal Processing)
Category: Audio/Speech Coding
voice morphing, voice conversion, interpolation of spectra, temporal decomposition, spectrum distortion, DP matching, Δ cepstrum,
Full Text: PDF(415.3KB)>>
We propose Optimal Temporal Decomposition (OTD) of speech for voice morphing preserving Δ cepstrum. OTD is an optimal modification of the original Temporal Decomposition (TD) by B. Atal. It is theoretically shown that OTD can achieve minimal spectral distortion for the TD-based approximation of time-varying LPC parameters. Moreover, by applying OTD to preserving Δ cepstrum, it is also theoretically shown that Δ cepstrum of a target speaker can be reflected to that of a source speaker. In frequency domain interpolation, the Laplacian Spectral Distortion (LSD) measure is introduced to improve the Inverse Function of Integrated Spectrum (IFIS) based non-uniform frequency warping. Experimental results indicate that Δ cepstrum of the OTD-based morphing spectra of a source speaker is mostly equal to that of a target speaker except for a piecewise constant factor and subjective listening tests show that the speech intelligibility of the proposed morphing method is superior to the conventional method.