This chapter covers methods for finding the pronunciation of a word, either from a lexicon (a large list of words and their pronunciations) or by letter-to-sound rules.

Word pronunciations

A pronunciation in Festival requires not just a list of phones but also a syllabic structure. In some languages the syllabic structure is very simple and well defined, and can be derived unambiguously from a phone string. In English, however, this is not always the case (compound nouns being the difficult case).
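To make the "simple and well defined" case concrete, here is a minimal Python sketch for a hypothetical language in which every syllable is open (zero or more consonants followed by one vowel). The function name `syllabify_open`, the vowel set and the sample phone string are assumptions made for illustration; this is not Festival's own syllabification code, and it would not handle English.

```python
from typing import Iterable, List, Set


def syllabify_open(phones: Iterable[str], vowels: Set[str]) -> List[List[str]]:
    """Split a phone string into syllables, closing a syllable after each vowel.

    Assumes a hypothetical language with only open syllables. Any consonants
    left over at the end are attached to the last syllable.
    """
    syllables: List[List[str]] = []
    current: List[str] = []
    for phone in phones:
        current.append(phone)
        if phone in vowels:
            # A vowel ends the current syllable.
            syllables.append(current)
            current = []
    if current:
        if syllables:
            # Trailing consonants join the final syllable.
            syllables[-1].extend(current)
        else:
            # No vowel at all: keep the phones as a single unit.
            syllables.append(current)
    return syllables


# Illustrative vowel inventory and phone string (not from any Festival phoneset).
VOWELS = {"a", "e", "i", "o", "u"}
print(syllabify_open(["k", "a", "t", "a", "n", "a"], VOWELS))
# prints [['k', 'a'], ['t', 'a'], ['n', 'a']]
```

For a language like this the lexicon could store flat phone strings and derive the syllables on demand. Where the boundaries are not predictable, as with English compounds, they have to be marked in the entry itself.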

The basic lexicon structure available in Festival takes both a word and a part of speech (an arbitrary token) to find the pronunciation. For English this is probably the optimal form: although the language has homographs, the word itself and a fairly broad part of speech tag will mostly identify the proper pronunciation.
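The idea of keying a pronunciation on both the word and a part of speech can be sketched in a few lines of Python. This is only an illustration of the lookup logic, not Festival's implementation: the names `TOY_LEXICON` and `lookup` are hypothetical, the phone symbols are illustrative and not copied from any Festival lexicon, and falling back to the first listed entry when no part of speech matches is an assumption of this sketch.

```python
from typing import Dict, List, Optional, Tuple

Pron = Tuple[str, ...]  # a flat phone string is enough for this sketch
Lexicon = Dict[str, List[Tuple[Optional[str], Pron]]]

# Toy data: "lives" is a homograph whose pronunciation depends on its tag.
TOY_LEXICON: Lexicon = {
    "lives": [
        ("n", ("l", "ai", "v", "z")),
        ("v", ("l", "i", "v", "z")),
    ],
    "cat": [(None, ("k", "a", "t"))],  # None: entry carries no tag
}


def lookup(lexicon: Lexicon, word: str, pos: Optional[str] = None) -> Optional[Pron]:
    """Return the pronunciation for (word, pos), or None if the word is unlisted."""
    entries = lexicon.get(word.lower())
    if not entries:
        # Unknown word: a real system would try letter-to-sound rules here.
        return None
    for entry_pos, pron in entries:
        if entry_pos == pos:
            return pron
    # Assumption of this sketch: with no matching tag, use the first entry.
    return entries[0][1]


print(lookup(TOY_LEXICON, "lives", "v"))  # prints ('l', 'i', 'v', 'z')
print(lookup(TOY_LEXICON, "lives", "n"))  # prints ('l', 'ai', 'v', 'z')
```

Because the tag is an arbitrary token, the same structure works whether the tags are coarse classes such as `n` and `v` or something more specific to the language.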

An example entry is

 ("photography"
  n
  (((f @ ) 0) ((t o g) 1) ((r @ f) 0) ((ii) 0)))

Note that in addition to the explicit marking of syllables, a stress value is also given (0 or 1). In some languages lexical stress is fully predictable, in others highly irregular. In some languages this field may be more appropriately used for another purpose, e.g. tone type in Chinese.
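As a rough illustration of this structure, the syllable list from the example entry above can be modelled in Python as a list of (phones, stress) pairs. The names `SYLLABLES`, `flat_phones`, `stress_pattern` and `stressed_indices` are hypothetical helpers for this sketch and are not part of Festival.

```python
from typing import List, Sequence, Tuple

Syllable = Tuple[Tuple[str, ...], int]  # (phones in the syllable, stress value)

# The syllable list from the example entry, transcribed by hand.
SYLLABLES: List[Syllable] = [
    (("f", "@"), 0),
    (("t", "o", "g"), 1),
    (("r", "@", "f"), 0),
    (("ii",), 0),
]


def flat_phones(syllables: Sequence[Syllable]) -> List[str]:
    """Drop the syllable boundaries and return the plain phone string."""
    return [phone for phones, _ in syllables for phone in phones]


def stress_pattern(syllables: Sequence[Syllable]) -> List[int]:
    """Return the per-syllable stress (or tone) values in order."""
    return [stress for _, stress in syllables]


def stressed_indices(syllables: Sequence[Syllable]) -> List[int]:
    """Return the positions of syllables whose stress value is non-zero."""
    return [i for i, (_, stress) in enumerate(syllables) if stress > 0]


print(flat_phones(SYLLABLES))
# prints ['f', '@', 't', 'o', 'g', 'r', '@', 'f', 'ii']
print(stress_pattern(SYLLABLES))    # prints [0, 1, 0, 0]
print(stressed_indices(SYLLABLES))  # prints [1]
```

Nothing in this representation depends on the second field meaning stress. A language that uses it for tone type would store a tone label there and read it back the same way.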

There may be other languages which require a more complex (or less complex) format, and the decision to use some other format rather than this one is up to you.

Currently there is only residual support for morphological analysis in Festival. A finite state transducer based analyzer for English, based on the work in [ritchie92], is included in festival/lib/engmorph.scm and festival/lib/engmorphsyn.scm, but this should be considered experimental at best. Given the lack of such an analyzer, our lexicons need to list not only the base forms of words but also all their morphological variants. This is (more or less) acceptable in languages such as English or French, but for languages with richer morphology, such as German, it may seem an unnecessary requirement. For agglutinative languages such as Finnish and Turkish it appears to be even more of a restriction.

This is probably true, but the current restriction is not necessarily hopeless. We have successfully built very good letter-to-sound rules for German, a language with a rich morphology, which allow the system to properly predict pronunciations of morphological variants of root words it has not seen before. We have not yet done any experiments with Finnish or Turkish, but expect this technique would work (though of course developing a proper morphological analyzer would be better).