Building Synthetic Voices

Building letter-to-sound rules by hand

For many languages there is a systematic relationship between the written form of a word and its pronunciation. For some languages this relationship is fairly easy to write down by hand. Festival has a letter-to-sound rule system that allows rules to be written by hand, but we also provide a method for building rule sets automatically, which will often be more useful. The choice between hand-written and automatically trained rules depends on the language you are dealing with and on the relationship between its orthography and its phone set.

For languages with a well-defined letter-to-sound relationship, such as Spanish and Croatian, writing rules by hand can be simpler than training. Training requires an existing set of lexical entries to train from, and that may be your deciding criterion. Hand-written letter-to-sound rules are context-dependent rewrite rules which are applied in sequence, mapping strings of letters to strings of phones (though the system does not explicitly care what the strings will actually be used for).

The basic form of the rules is

( LC [ alpha ] RC => beta )

This is interpreted as: alpha, a string of one or more symbols on the input tape, is written to beta, a string of zero or more symbols on the output tape, when in the presence of LC, a left context of zero or more input symbols, and RC, a right context of zero or more input symbols. Note that the input tape and the output tape are different. Although the input and output alphabets need not be distinct, the left hand side of a rule can only refer to the input tape and never to anything that has been produced by a right hand side. Thus rules within a rule set cannot "feed" or "bleed" themselves. It is possible to cascade multiple rule sets, but we will discuss that below.
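The interpretation above can be sketched in a few lines of Python. This is only an illustration of the rule semantics, not Festival's implementation: the function name apply_rules, the tuple representation of a rule, and the toy rule set are all assumptions made for the example, and contexts are restricted to literal symbols.

```python
from typing import List, Sequence, Tuple

# One rule ( LC [ alpha ] RC => beta ), stored as four tuples of symbols.
# This representation is an assumption made for the sketch.
Rule = Tuple[Tuple[str, ...], Tuple[str, ...], Tuple[str, ...], Tuple[str, ...]]


def apply_rules(word: Sequence[str], rules: Sequence[Rule]) -> List[str]:
    """Rewrite the symbols of `word` with the first rule that matches at each position."""
    tape = ["#", *word, "#"]      # input tape with word boundaries
    last = len(tape) - 1          # index of the trailing boundary
    out: List[str] = []           # separate output tape; never consulted for matching
    pos = 1
    while pos < last:
        for lc, alpha, rc, beta in rules:
            stop = pos + len(alpha)
            if (
                alpha
                and stop <= last
                and tuple(tape[pos:stop]) == alpha
                and len(lc) <= pos
                and tuple(tape[pos - len(lc):pos]) == lc
                and tuple(tape[stop:stop + len(rc)]) == rc
            ):
                out.extend(beta)  # beta may be empty, which deletes alpha
                pos = stop
                break
        else:
            raise ValueError(f"no rule matches {tape[pos]!r}")
    return out


# Toy rule set (hypothetical): a => b, and b => c.
toy_rules = [
    ((), ("a",), (), ("b",)),
    ((), ("b",), (), ("c",)),
]

# The b written for "a" lives on the output tape, so the second rule never sees it.
print(apply_rules("ab", toy_rules))  # ['b', 'c']
```

Because contexts are matched only against the input tape, the result is b c rather than c c: the first rule cannot feed the second.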

For example, to deal with the pronunciation of the letters "ch" word-initially in English we may write two rules like this

( # [ c h ] r => k )
( # [ c h ] => ch )

These deal with words like "christmas" and "chair". Note the # symbol is special and is used to denote a word boundary. LTS rules may refer to a word boundary but cannot refer to previous or following words; for that you would need some form of post-lexical rule (see the Section called Post-lexical rules), where the word is within some context. In the above rules we are mapping two letters, c and h, to a single phone, k or ch. Also note the order of these rules. The first rule is more specific than the second, so it should appear first to deal with the specific case. If the order were reversed, the k rule could never apply, as the ch rule would cover that case too.

Thus LTS rules should be written with the most specific cases first and typically end in a default case. There should be a default case for every individual letter in the language's alphabet, without any context restrictions, mapping to some default phone. Therefore following the above rules there would be other c rules with various contexts, but the final one should probably be

( [ c ] => k )
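The effect of rule order can be checked with a small Python sketch. It is an illustration only: the helper first_match and the tuple encoding of rules are assumptions for this example, and it looks at a single position rather than transcribing the whole word.

```python
def first_match(word, pos, rules):
    """Return the output of the first rule that matches `word` at letter index `pos`."""
    tape = ["#", *word, "#"]          # add word boundaries
    i = pos + 1                       # shift past the leading boundary
    for lc, alpha, rc, beta in rules:
        stop = i + len(alpha)
        if (
            tape[i:stop] == list(alpha)
            and len(lc) <= i
            and tape[i - len(lc):i] == list(lc)
            and tape[stop:stop + len(rc)] == list(rc)
        ):
            return beta
    return None


specific_first = [
    (("#",), ("c", "h"), ("r",), ("k",)),   # ( # [ c h ] r => k )
    (("#",), ("c", "h"), (), ("ch",)),      # ( # [ c h ] => ch )
    ((), ("c",), (), ("k",)),               # ( [ c ] => k ), the default
]
# Same rules with the first two swapped.
general_first = [specific_first[1], specific_first[0], specific_first[2]]

print(first_match("christmas", 0, specific_first))  # ('k',)
print(first_match("chair", 0, specific_first))      # ('ch',)
print(first_match("christmas", 0, general_first))   # ('ch',)
print(first_match("cat", 0, specific_first))        # ('k',)
```

With the general rule first, "christmas" wrongly receives ch, because the more specific rule is never reached. The last call shows the default rule catching a c that no contextual rule covers.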

As this is a common error in writing these rules, it is worth repeating: if a rule set is to be universally applicable, every letter in the input alphabet must have at least one rule mapping it to some phone.
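A quick sanity check for this error might look like the following Python sketch. The function name letters_without_default and the tuple encoding of rules are assumptions for the example, and the word-final h rule is hypothetical. The check is deliberately strict: it asks for a context-free, single-letter rule per letter, which is the default case recommended above.

```python
def letters_without_default(alphabet, rules):
    """List letters that lack a context-free, single-letter rule."""
    covered = {
        alpha[0]
        for lc, alpha, rc, _beta in rules
        if not lc and not rc and len(alpha) == 1
    }
    return sorted(set(alphabet) - covered)


rules = [
    (("#",), ("c", "h"), ("r",), ("k",)),   # ( # [ c h ] r => k )
    (("#",), ("c", "h"), (), ("ch",)),      # ( # [ c h ] => ch )
    ((), ("c",), (), ("k",)),               # ( [ c ] => k )
    ((), ("h",), ("#",), ()),               # hypothetical: h is silent word-finally
]

# h only has contextual rules, so it is reported as missing a default.
print(letters_without_default("ch", rules))  # ['h']
```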

The section to be mapped (within square brackets) and the section it is mapped into (after the "=>") must consist of items from the input and output alphabets respectively, and may not include sets or regular expression operators. This does mean more rules need to be written explicitly than you might like, but it also helps you not to forget rules that are required.

For some languages it is convenient to write a number of rule sets. For example, one to map the input into lower case, and maybe deal with alternate treatments of accented characters, e.g. rewriting the ASCII "e'" as e acute. We have also used rule sets to post-process the generated phone string to add stress and syllable breaks.
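The idea of cascading rule sets, where the output tape of one set becomes the input tape of the next, can be sketched in Python as below. This is a simplified illustration, not Festival's mechanism: the functions rewrite and cascade are hypothetical, the rules are context-free with longest match standing in for rule ordering, and the phone names (including e1 for the accented vowel) are invented for the example.

```python
def rewrite(symbols, table):
    """Context-free rewrite: at each position use the longest key in `table` that matches."""
    longest = max(len(key) for key in table)
    out, pos = [], 0
    while pos < len(symbols):
        for size in range(min(longest, len(symbols) - pos), 0, -1):
            key = tuple(symbols[pos:pos + size])
            if key in table:
                out.extend(table[key])
                pos += size
                break
        else:
            raise ValueError(f"no rule for {symbols[pos]!r}")
    return out


def cascade(word, stages):
    """Run each rule set in turn; one stage's output is the next stage's input."""
    symbols = list(word)
    for table in stages:
        symbols = rewrite(symbols, table)
    return symbols


# Stage 1 (hypothetical): lower-case the input and turn ASCII "e'" into "é".
normalise = {(ch,): (ch.lower(),) for ch in "CAFEcafe"}
normalise[("e", "'")] = ("é",)
normalise[("E", "'")] = ("é",)

# Stage 2 (hypothetical phone names): letters to phones.
to_phones = {
    ("c",): ("k",),
    ("a",): ("a",),
    ("f",): ("f",),
    ("e",): ("e",),
    ("é",): ("e1",),
}

print(cascade("Cafe'", [normalise, to_phones]))  # ['k', 'a', 'f', 'e1']
```

Keeping normalisation in its own stage means the letter-to-phone rules only ever need to handle lower-case letters and one spelling of each accented character.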

Finally, some people have stressed that writing good letter-to-sound rules is hard. We would disagree with this: from our experience, writing good letter-to-sound rules by hand is very hard, very skilled and very laborious. For anything but the simplest of languages, writing rules by hand requires much more time than people typically have, and the result will still contain errors (even with an exception list). However, hand-written rule sets may be ideal in some circumstances.
