Resources

In this chapter we try to list some of the important resources that you may need when building a voice in Festival. This list cannot be complete and comprehensive, but we will try to give references to meta-resources as well as direct references to information, code, and data that may be of use to you.

This document itself will be updated occasionally and it is worth checking to ensure that you have the latest copy.

Updates, new databases, new language support, etc. will appear intermittently, and new voices will be released which may help you develop your own new voices.

http://festvox.org

has been set up as a resource center for voices in Festival, offering databases, examples, and a repository for voice distribution. It is worth checking that site regularly.

Specifically

http://festvox.org/examples/cmu_us_kal_diphone/

offers a complete example US English diphone database as built using the walkthrough in the Chapter called US/UK English Diphone Synthesizer. The originally recorded diphone database is also available, as is, at http://festvox.org/databases/cmu_us_kal_diphone/.

http://festvox.org/examples/cmu_time_awb_ldom/

offers a complete example limited domain synthesis database as built using the walkthroughs in the Chapter called Limited domain synthesis.

Other databases, lexicons, etc. will be installed on festvox.org as they become available.

There is also a mailing list, festvox-talk@festvox.org, for discussing aspects of building voices. See http://festvox.org/maillist.html for details of joining it and for the archive of messages already sent. Also, while traffic is low, feel free to mail the authors, awb@cs.cmu.edu or lenzo@cs.cmu.edu, and we will try to help where we can.

Festival resources

The Festival home page is http://www.cstr.ed.ac.uk/projects/festival/. It is updated regularly as new developments happen.

The Festival Speech Synthesis System code and the Edinburgh Speech Tools library and related programs are available from

ftp://ftp.cstr.ed.ac.uk/pub/festival/

or in the US at

http://www.festvox.org/festival/downloads.html

Note that precompiled versions of the system are also available from that site, though at the time of writing only Linux binaries are available.

Festival comes with its own manual in HTML, PostScript, and GNU info formats. It and a less comprehensive Speech Tools manual are pre-built in festdoc-1.4.1.tar.gz. The manuals are also available online at

http://www.festvox.org/docs/manual-1.4.2/festival_toc.html
http://www.festvox.org/docs/speech_tools-1.2.0/book1.htm

You will likely need to reference these manuals often.

It will also be useful to have access to other voices developed in Festival, as seeing how others solve problems may make things clearer.

In addition to Festival itself, a number of other projects throughout the world use Festival and have also released resources. The "Related Projects" links give URLs to other organizations which you may find useful.

It is worth mentioning the Oregon Graduate Institute here, who have done a lot of work with the system and have released other voices for it (US English and Mexican Spanish). See http://cslu.cse.ogi.edu/tts/ for more details.

A second project worth mentioning is the MBROLA project [dutoit96], http://tcts.fpms.ac.be/synthesis/mbrola.html. They offer a waveform synthesis technique [dutoit93] and a number of diphone databases for many different languages. MBROLA itself doesn't offer a front end, just phone, duration and F0 target to waveform synthesis. (However, they do offer a full French TTS system too.) Their diphone databases complement Festival well, and a number of projects use MBROLA databases for their waveform synthesis and Festival as the front end. If you lack the resources to record and build diphone databases, this is a good place to check for existing diphone databases for your language. Most of their databases have some use or distribution restrictions, but they usually allow any non-commercial use.
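
To make the "phone, duration and F0 target" input more concrete, the following is a minimal sketch in Python of how such a description might be written out in MBROLA's plain-text .pho format, where each line gives a phone name, a duration in milliseconds, and optional pairs of position (as a percentage of the duration) and F0 in Hz. The function name to_pho and the example phone names are assumptions made for illustration only; the actual phone set depends on the MBROLA database being used, and this is not the way Festival itself generates MBROLA input.

```python
from typing import List, Sequence, Tuple

# One segment: (phone name, duration in ms, [(position in % of duration, F0 in Hz), ...])
Segment = Tuple[str, int, Sequence[Tuple[int, int]]]


def to_pho(segments: Sequence[Segment]) -> str:
    """Render segments as .pho text, one phone per line (hypothetical helper)."""
    lines: List[str] = []
    for phone, dur_ms, targets in segments:
        fields = [phone, str(dur_ms)]
        # Each F0 target is written as "<percent> <Hz>" after the duration.
        for percent, f0_hz in targets:
            fields += [str(percent), str(f0_hz)]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


# Illustrative phone names only; real names depend on the chosen database.
# "_" is the silence symbol.
example: List[Segment] = [
    ("_", 100, []),
    ("h", 60, []),
    ("@", 80, [(50, 120)]),
    ("l", 70, []),
    ("@U", 200, [(10, 130), (90, 100)]),
    ("_", 100, []),
]

if __name__ == "__main__":
    print(to_pho(example), end="")
```

The resulting text would then be passed to the MBROLA program together with a diphone database; in the setups described above, a front end such as Festival is what predicts the phones, durations, and F0 targets.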