Natural sounding TTS based on syllable-like units
Date Issued
01-12-2006
Author(s)
Thomas, Samuel
Rao, M. Nageshwara
Indian Institute of Technology, Madras
Ramalingam, C. S.
Abstract
In this work we describe a new "syllable-like" speech unit that is suitable for concatenative speech synthesis. These units are automatically generated using a group delay based segmentation algorithm and acoustically correspond to the form C*VC* (C: consonant, V: vowel). The effectiveness of the unit is demonstrated by synthesizing natural-sounding speech in Tamil, a regional Indian language. Significant quality improvement is obtained if bisyllable units are also used, rather than just monosyllables, with results far superior to the traditional diphone-based approach. An important advantage of this approach is the elimination of prosody rules. Since f0 is part of the target cost, the unit selection procedure chooses the best unit from among the many candidates. The naturalness of the synthesized speech demonstrates the effectiveness of this approach.
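
The role of f0 in the target cost can be illustrated with a minimal Python sketch. It is not the authors' implementation: the Unit fields, the relative-error cost terms, the weights, the toy inventory, and the target values are all assumptions chosen for illustration, and the abstract does not say how target values are obtained. A complete unit selection system would normally also include a concatenation (join) cost and a search over whole unit sequences, both omitted here.

```python
from dataclasses import dataclass


@dataclass
class Unit:
    """One candidate syllable-like unit from a (hypothetical) speech inventory."""
    label: str       # unit identity, e.g. a transliterated syllable
    f0: float        # mean fundamental frequency in Hz
    duration: float  # duration in seconds


def target_cost(candidate, target_f0, target_duration, w_f0=1.0, w_dur=1.0):
    """Weighted sum of relative f0 and duration mismatches (assumed form)."""
    f0_term = abs(candidate.f0 - target_f0) / target_f0
    dur_term = abs(candidate.duration - target_duration) / target_duration
    return w_f0 * f0_term + w_dur * dur_term


def select_unit(candidates, label, target_f0, target_duration):
    """Return the candidate with the given label that has the lowest target cost."""
    matching = [u for u in candidates if u.label == label]
    if not matching:
        raise ValueError(f"no candidate units with label {label!r}")
    return min(matching, key=lambda u: target_cost(u, target_f0, target_duration))


# Toy inventory: three recordings of the same unit with different pitch.
inventory = [
    Unit("kan", 110.0, 0.20),
    Unit("kan", 135.0, 0.21),
    Unit("kan", 160.0, 0.18),
    Unit("mal", 130.0, 0.25),
]

best = select_unit(inventory, "kan", target_f0=130.0, target_duration=0.20)
print(best)
```

Because each candidate carries its own recorded pitch, picking the lowest-cost unit selects a suitable contour directly from the inventory rather than imposing one through separate rules.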