Building Synthetic Voices: Lexicons
Because it is impossible to list every word in a natural language, a
general text-to-speech system needs some way to pronounce
out-of-vocabulary words. In some languages this is easy, but in others
it is very hard. Whatever you do, you must provide
something, even if it simply replaces the unknown word with the word
"unknown" (or its local-language equivalent). By default, a lexicon
in Festival throws an error if a requested word isn't found. To
change this you can set the lts_method. Most usefully, you can
set it to the name of a function that takes a word and a part-of-speech
specification and returns a word pronunciation as described above.
For example, if we are always going to return the word unknown but
print a warning that the word is being ignored, a suitable function is

    (define (mylex_lts_function word feats)
      "Deal with out of vocabulary word."
      (format t "unknown word: %s\n" word)
      '("unknown" n (((uh n) 1) ((n ou n) 1))))

Note that the pronunciation of "unknown" must be in the appropriate phone set. The syllabic structure is also required. You need to specify this function for your lexicon as follows

    (lex.set.lts.method 'mylex_lts_function)

One level above merely identifying out-of-vocabulary words, they can be spelled out. This of course isn't ideal, but it allows the basic information to be passed to the listener. This can be done with the out-of-vocabulary function, as follows.

    (define (mylex_lts_function word feats)
      "Deal with out of vocabulary words by spelling out the letters in the
    word."
      (if (equal? 1 (length word))
          (begin
            (format t "the character %s is missing from the lexicon\n" word)
            '("unknown" n (((uh n) 1) ((n ou n) 1))))
          ;; Build a (word pos syllables) entry from the letters' syllables
          (list
           word
           'n
           (apply
            append
            (mapcar
             (lambda (letter)
               (car (cdr (cdr (lex.lookup letter 'n)))))
             (symbolexplode word))))))

A few points are worth noting in this function. It recursively calls the
lexical lookup function on the characters in a word. Each letter should
appear in the lexicon with its pronunciation (in isolation), but a check
is made to ensure we don't recurse forever. The symbolexplode
function assumes that letters are single
bytes, which may not be true for some languages; that function would
need to be replaced for such a language. Note that we append the
syllables of each of the letters in the word. For long words this might
be too naive, as there could be internal prosodic structure in such a
spelling that this method would not allow for. In that case you would
want the letters to be words, so the symbol explosion should happen at
the token-to-word level. Also, the above function assumes that the part
of speech for letters is n. This is only really important where
letters are homographs in a language, so that the part of speech can be
used to distinguish which pronunciation you require (cf. "a" in English
or "y" in French).
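
The spelling fallback relies on every letter having its own entry in the lexicon. As a minimal sketch, such entries could be added to the current lexicon's addenda with lex.add.entry. The phone names below are illustrative assumptions for an MRPA-style English phone set and are not taken from any particular voice; substitute the phones and letters of your own phone set and language.

```scheme
;; Illustrative letter entries only: the phones (ei, b, ii, s) are assumed
;; to exist in the lexicon's phone set. Each entry has the usual form
;; (word part-of-speech syllables), with n as the part of speech so that
;; (lex.lookup letter 'n) in the spelling function finds them.
(lex.add.entry '("a" n (((ei) 1))))
(lex.add.entry '("b" n (((b ii) 1))))
(lex.add.entry '("c" n (((s ii) 1))))
```

With entries like these in place for the whole alphabet, the single-character branch of the spelling function is reached only for characters that are genuinely missing.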