Next: Building voices Up: Flite: a small fast Previous: Flite system

Languages, Lexicons and Voices

Flite is the core library. For synthesis, this library requires three further parts to make a complete synthesizer:

language model: provides the phoneset, tokenization rules, text analysis, prosodic structures etc. The term is not used here in the sense of "Language Model" common in speech recognition, but as an encompassing term for language components that may be shared by many voices.
lexicon: a pronunciation model including a lexicon and letter-to-sound rules for out-of-vocabulary words. The lexicon depends, obviously, upon the unit inventory of the language, and possibly upon the domain.
voice: the unit inventory, speaker-specific prosody models and the definition of the voice itself. A voice depends upon the primitives provided by the language model.
The first two of these can be shared across voices of the same language. Each of the three parts is compiled into a separate library.

Unlike Festival, voice definitions are explicitly attached to each utterance as it is created. In Festival there is a notion of a "current voice" accessed through a global variable, which is not thread safe. In Flite, all top-level synthesis routines require a voice as an argument, which is then attached as a feature to each created utterance. A voice definition includes the definition of how synthesis is to proceed. This is specified as a C function which calls the necessary sub-functions for tokenization, lexical access, prosody etc. This means voices themselves can specify what steps are required for rendering text as speech. Although Festival could support such a model, it does not by default.
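The design described above can be illustrated with a minimal sketch. All type and function names here (voice, utterance, synth_text, full_synth, flat_synth) are hypothetical and chosen for illustration only; they are not Flite's actual declarations. The sketch shows a voice being passed explicitly, attached to the utterance on creation, and supplying its own synthesis function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical types for illustration; not Flite's actual declarations. */
struct utterance;

typedef struct voice {
    const char *name;
    /* The voice itself decides which steps synthesis goes through. */
    void (*synth)(struct utterance *u);
} voice;

typedef struct utterance {
    const char *text;
    const voice *v;      /* voice attached when the utterance is created */
    char trace[64];      /* records which steps ran, for demonstration */
} utterance;

/* Append a step name to the trace, never overflowing the buffer. */
static void step(utterance *u, const char *name) {
    strncat(u->trace, name, sizeof(u->trace) - strlen(u->trace) - 1);
}

/* Stand-ins for the real sub-functions. */
static void tokenize(utterance *u)       { step(u, "tok "); }
static void lexical_access(utterance *u) { step(u, "lex "); }
static void prosody(utterance *u)        { step(u, "pros "); }
static void waveform(utterance *u)       { step(u, "wave "); }

/* One voice runs every step ... */
static void full_synth(utterance *u) {
    tokenize(u); lexical_access(u); prosody(u); waveform(u);
}

/* ... another voice chooses to skip the prosody step. */
static void flat_synth(utterance *u) {
    tokenize(u); lexical_access(u); waveform(u);
}

const voice full_voice = { "full", full_synth };
const voice flat_voice = { "flat", flat_synth };

/* Top-level routine: the voice is an argument, not a global variable,
   so concurrent callers with different voices do not interfere. */
utterance synth_text(const char *text, const voice *v) {
    assert(text != NULL && v != NULL && v->synth != NULL);
    utterance u = { text, v, "" };
    v->synth(&u);
    return u;
}
```

Calling synth_text("hello", &full_voice) and synth_text("hello", &flat_voice) yields utterances that each carry their own voice pointer and a different step trace, with no shared state between the two calls.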

A voice definition consists of a set of feature-value pairs setting voice-specific aspects such as the models for prosody, the unit selection database to use, the lexicon etc. The equivalent in Festival is not so neat (though this model was discussed as a method for Festival at various times).
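As a rough sketch of what a voice built from feature-value pairs might look like, consider the following. The names (feature, voice_def, voice_feat) and the feature keys and values are assumptions made for illustration, not Flite's own structures, and values are plain strings here for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical feature-value pair; values are strings for simplicity. */
typedef struct feature {
    const char *name;
    const char *value;
} feature;

/* Hypothetical voice definition: a name plus a table of features. */
typedef struct voice_def {
    const char *name;
    const feature *features;
    size_t n_features;
} voice_def;

/* Look up a feature by name; return fallback if the voice does not set it. */
const char *voice_feat(const voice_def *v, const char *name,
                       const char *fallback) {
    assert(v != NULL && name != NULL);
    for (size_t i = 0; i < v->n_features; i++) {
        if (strcmp(v->features[i].name, name) == 0)
            return v->features[i].value;
    }
    return fallback;
}

/* Example voice: the keys and values are made up for this sketch. */
static const feature example_features[] = {
    { "lexicon",       "example_lex"   },
    { "prosody_model", "example_f0"    },
    { "unit_database", "example_units" },
};

const voice_def example_voice = {
    "example",
    example_features,
    sizeof example_features / sizeof example_features[0],
};
```

A synthesis step would then ask the voice for what it needs, for example voice_feat(&example_voice, "lexicon", "default_lex"), rather than consulting any global setting.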


Alan W Black 2001-08-26