In some synthesis tasks the range of spoken output required is
limited, even though the set of possible utterances can still be
infinite. The goal of limited domain synthesis is to make the most
common phrases sound as good as possible.
At one extreme, all required utterances can be pre-recorded, but
that is too restrictive. Rather than going to the other extreme of
recording only diphones (or unit selection), some words and phrases
can be recorded and used alongside a diphone (or unit selection)
database. This gives good coverage for common phrases while never
giving bad coverage for less common forms.
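The idea of preferring recorded material and falling back to a
general database can be illustrated with a toy sketch. This is not
the mechanism of the system described here: it works on whole words
rather than on units, and the names plan_utterance and recorded are
hypothetical. It assumes a greedy longest-match strategy, covering
an utterance with the longest recorded phrases available and marking
every other word for the fallback synthesizer.

```python
from typing import List, Set, Tuple

Phrase = Tuple[str, ...]


def plan_utterance(words: List[str], recorded: Set[Phrase]) -> List[Tuple[str, Phrase]]:
    """Cover `words` with recorded phrases where possible (longest match
    first) and mark the remaining words for a general fallback voice.

    Toy illustration only: a real system selects sub-word units, not words.
    """
    max_len = max((len(p) for p in recorded), default=0)
    plan: List[Tuple[str, Phrase]] = []
    i = 0
    while i < len(words):
        match = None
        # Try the longest candidate phrase starting at position i first.
        for n in range(min(max_len, len(words) - i), 0, -1):
            cand = tuple(words[i:i + n])
            if cand in recorded:
                match = cand
                break
        if match is not None:
            plan.append(("recorded", match))
            i += len(match)
        else:
            # No recording starts here: hand this word to the fallback.
            plan.append(("fallback", (words[i],)))
            i += 1
    return plan


# Hypothetical talking-clock inventory.
recorded = {("the", "time", "is"), ("now",)}
plan = plan_utterance("the time is nearly now".split(), recorded)
```

In this example the in-domain phrase "the time is" and the word
"now" come from recordings, while the out-of-domain word "nearly"
is passed to the fallback.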
Here we analyze the language generation system used in the task and
build a list of utterances typical for that domain. We
record them, autolabel them, and build a unit selection synthesizer.
The result is a very high quality synthesizer for utterances
within the task domain.
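One possible way to build such an utterance list is sketched below.
The selection criterion is an assumption made for illustration, not
something stated here: given candidate utterances from the language
generation system, it greedily picks prompts until every word in the
candidates appears in at least one of them. The function name
select_prompts and the example sentences are hypothetical.

```python
from typing import List


def select_prompts(candidates: List[str]) -> List[str]:
    """Greedily choose utterances so that every word occurring in the
    candidates is covered by at least one chosen prompt.

    Assumed criterion (word coverage); a real prompt list may also need
    to cover words in different contexts and phrase positions.
    """
    tokenized = [set(c.lower().split()) for c in candidates]
    uncovered = set().union(*tokenized)
    chosen: List[str] = []
    while uncovered:
        # Pick the candidate that covers the most still-uncovered words.
        best = max(range(len(candidates)),
                   key=lambda k: len(tokenized[k] & uncovered))
        chosen.append(candidates[best])
        uncovered -= tokenized[best]
    return chosen


prompts = select_prompts([
    "the time is now",
    "the time is noon",
    "it is now noon",
])
```

Here two of the three candidates are enough to cover every word, so
only those two would need to be recorded under this criterion.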
For a more detailed description of these techniques and their
advantages see:
- Black, A. and Lenzo, K. (2000) Limited Domain Synthesis, ICSLP2000,
  Beijing, China.