Relating phonemes to sounds is not as obvious as people think. Even when one is familiar with phone sets, it's easy to make mistakes when reading lists of phones alone. This is particularly true when reading diphone nonsense words. The tables provided here are intended for both experienced and inexperienced readers of phones, to help them decide on the pronunciation.
These tables are not supposed to be a substitute for a good phonetics course; they are intended to give people a basic idea of the pronunciation of the phone sets used in the particular examples in this document. Many simplifying assumptions have been made, often without being mentioned. To the phoneticians out there I apologise: however wrong the assumptions are, we are listing here atomic discrete phones which we have found useful in building practical systems, even though better sets probably exist.
In spite of everyone telling you that there is one and only one US phoneset, when it comes to actually using one you quickly discover that there are many standard ones, used by lots of different pieces of software. Often the differences between them are trivial (e.g. case folding), but computers, being fundamentally dumb, can't take these trivial differences into account. Here we list the radio phoneset, which is used by the standard US voices in Festival. The definition is in festival/lib/radio_phones.scm. This list was based on those phones that appear in the Boston University FM radio corpus, with minor modifications. The list here is exactly those phones which appear in the diphone nonsense words used in the example explained in the chapter called US/UK English Diphone Synthesizer.
lAWn, dOOr, mAll
hOW, sOUth, brOWser
fERtile, sEARch, makER
Camera, jaCK, Kill
Zero, quiZ, boyS
A - (hyphen) in the nonsense word itself denotes
an explicit syllable boundary. Thus
pau t aa n - k aa pau
states that the word should be pronounced as
tank ah. Where no explicit syllable boundary
is given, the word should be pronounced naturally without
any boundary (which is probably too underspecified in some cases).
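
The hyphen convention can be illustrated with a small sketch. This is not part of Festival; the function name split_syllables is hypothetical, and the sketch assumes phones are separated by single spaces and that pau (silence) should be dropped from the result. It compares whole tokens, so a hyphen inside a phone name (such as a cluster name containing _-_) is not mistaken for a syllable boundary.

```python
def split_syllables(phones: str) -> list[list[str]]:
    """Split a space-separated phone string into syllables.

    Assumptions (not taken from Festival itself):
      - a standalone "-" token marks an explicit syllable boundary
      - "pau" is silence and is dropped from the output
    """
    syllables: list[list[str]] = [[]]
    for token in phones.split():
        if token == "-":
            # Explicit boundary: start a new syllable.
            syllables.append([])
        elif token != "pau":
            # Ordinary phone: add it to the current syllable.
            syllables[-1].append(token)
    # Discard empty syllables (e.g. from a leading or doubled hyphen).
    return [s for s in syllables if s]


print(split_syllables("pau t aa n - k aa pau"))
# prints [['t', 'aa', 'n'], ['k', 'aa']]
```

A nonsense word with no standalone hyphen comes back as a single group, which matches the "no explicit boundary" case described above.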
An _ (underscore) in a phone name denotes a
consonant cluster. That is,
t_-_r is the /tr/ as found in
trip not that in