CMU_INDIC speech synthesis databases
The CMU_INDIC databases were constructed at the Language Technologies Institute at Carnegie Mellon University as phonetically balanced, single-speaker databases designed for corpus-based speech synthesis research. They cover major languages spoken in the Indian subcontinent.

The distributions include the raw waveform files, with transcriptions in each language's native script (the etc/txt.done.data file), as well as complete synthesis voices built from these databases using the CMU Clustergen statistical parametric speech synthesizer.
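As an illustration of working with the transcription file, the following is a minimal Python sketch that reads txt.done.data content into (utterance id, transcription) pairs. It assumes the usual FestVox line shape, `( utt_id "text" )`, one utterance per line, and UTF-8 encoding for the native-script text. The function name `parse_txt_done_data` and the utterance ids in the usage note are hypothetical, not part of the distribution.

```python
import re

# Assumed line shape (usual FestVox convention): ( utt_id "transcription text" )
_LINE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_txt_done_data(text):
    """Return a list of (utt_id, transcription) pairs from txt.done.data content."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        m = _LINE.match(line)
        if m is None:
            raise ValueError(f"unrecognised line: {line!r}")
        utt_id, transcript = m.group(1), m.group(2)
        # Undo backslash-escaped double quotes inside the transcription.
        entries.append((utt_id, transcript.replace('\\"', '"')))
    return entries
```

A file from an unpacked database could then be read with something like `parse_txt_done_data(open("etc/txt.done.data", encoding="utf-8").read())`, assuming the file sits at that path relative to the database directory.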

Complete Android voices for CMU Flite, built from these databases, are available in the Google Play store. You can hear voices built from these databases here.

CMU INDIC Databases

  • All 13 voices are available from packed
  • do_indic: a script to download and build a full voice from these databases (assuming the FestVox build tools are all installed).
These packed versions contain only the waveform files, and the txt.done.data file.
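Since a packed database pairs waveform files with transcriptions, a quick consistency check after unpacking can be useful. The sketch below lists utterance ids that have no matching waveform. It assumes that waveforms are stored as `<utt_id>.wav` in a single directory (commonly `wav/` in FestVox databases); the function name `missing_waveforms` is hypothetical.

```python
from pathlib import Path


def missing_waveforms(utt_ids, wav_dir):
    """Return the utterance ids that have no <utt_id>.wav file in wav_dir."""
    # Assumption: one waveform per utterance, named after its id.
    present = {p.stem for p in Path(wav_dir).glob("*.wav")}
    return [utt_id for utt_id in utt_ids if utt_id not in present]
```

An empty result means every listed utterance has a waveform; the ids themselves would come from the txt.done.data file.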


These datasets were collected and developed with help from Hear2Read. We acknowledge their contributions to making these practical languages for FestVox. Special thanks to Suresh Bazaj.

CMU/LTI This page is maintained by Alan W Black (awb@cs.cmu.edu)
Festvox is a project within LTI at Carnegie Mellon University