
The right domain

But even this may not be enough. Another direction, obvious to anyone who has built a unit selection speech synthesizer, is to exploit the fact that the quality of the output reflects very heavily the style and coverage of the recorded database. This can be done explicitly by building specific databases for specific applications. Because actual applications often use only a limited number of expressions, or at least a well-defined subset of the language, databases can be designed to cover that space without hitting the exponential increase in size that a general coverage database may require. [7] describes how to use such techniques to build reliable, high quality synthesizers easily for specific applications.
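To make the idea of designing a database to cover a limited space concrete, here is a minimal sketch of greedy prompt selection for a toy talking-clock domain. It is an illustration under stated assumptions, not the method of [7]: the function name select_prompts, the example sentences, and the choice of whole words as the unit of coverage are all hypothetical.

```python
def select_prompts(candidates):
    """Greedily choose sentences until every word in the candidates is covered.

    Assumption: coverage is measured over whole words, which only makes
    sense for a small, closed domain.
    """
    word_sets = [set(s.lower().split()) for s in candidates]
    uncovered = set().union(*word_sets)
    chosen = []
    while uncovered:
        # Pick the sentence that covers the most still-uncovered words.
        best = max(range(len(candidates)),
                   key=lambda i: len(word_sets[i] & uncovered))
        chosen.append(candidates[best])
        uncovered -= word_sets[best]
    return chosen


# Hypothetical candidate sentences for a talking-clock application.
prompts = [
    "the time is now five past two",
    "the time is now ten past three",
    "the time is now five past three",
    "the time is now ten past two",
]
chosen = select_prompts(prompts)
```

In this toy case two of the four sentences already contain every word the application can say. A real recording script would normally want each word in more than one context, so a sketch like this gives at best a lower bound on the script size.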

As no significant spectral or prosodic modification is done to the signal in these basic systems, it is not surprising that even general unit selection synthesizers are still somewhat tied to a domain. That is, if the voice is based on a news-reader database, it will still sound like a news-reader even when used for dialog.



Alan W Black 2002-09-30