The CMU_ARCTIC databases were constructed at the
Language Technologies Institute
at Carnegie Mellon University as phonetically balanced, US English
single speaker databases designed for unit selection speech synthesis
research.
A detailed report on the structure and content of the
database, the recording environment and related details is available as a
Carnegie Mellon University, Language Technologies Institute
Tech Report CMU-LTI-03-177 and is also available
here.
The databases consist of around 1150 utterances carefully selected
from out-of-copyright texts from Project Gutenberg. The databases
include US English male (bdl) and female (slt) speakers (both
experienced voice talent) as well as other accented speakers.
The 1132-sentence prompt list is available from
cmuarctic.data.
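As a rough illustration of working with the prompt list, the sketch below parses prompt lines into a mapping from utterance id to sentence text. It assumes each line has the Festival-style form `( utterance_id "sentence text" )`; that layout is an assumption and is not stated above, and the function name `parse_prompts` and the sample id are hypothetical.

```python
import re

# Assumed line format (not specified in the text above):
#   ( some_utterance_id "Some sentence." )
PROMPT_RE = re.compile(r'^\(\s*(\S+)\s+"(.*)"\s*\)\s*$')


def parse_prompts(lines):
    """Return a dict mapping utterance id to prompt text."""
    prompts = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        match = PROMPT_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised prompt line: {line!r}")
        utt_id, text = match.groups()
        prompts[utt_id] = text
    return prompts


if __name__ == "__main__":
    # Hypothetical sample line, not taken from the real prompt list.
    sample = ['( example_0001 "This is a hypothetical prompt." )']
    print(parse_prompts(sample))
```

To use it on a downloaded copy of the list, open the file with an explicit text encoding and pass the file object to `parse_prompts`. Lines that do not match the assumed format raise an error rather than being skipped silently.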
The distributions include 16 kHz waveform and simultaneous EGG
signals. Full phonetic labelling was performed by
CMU Sphinx using the FestVox
based labelling scripts. Complete runnable Festival voices are
included with the database distributions as examples, though
better voices can be made by improving the labelling, among other things.
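As a quick sanity check after downloading a distribution, the sketch below reads the header of a waveform file and reports its sampling rate, which should be 16 kHz according to the description above. It assumes the waveforms are PCM files in RIFF WAV format readable by Python's standard `wave` module; the function names and any file path passed in are hypothetical.

```python
import wave

EXPECTED_RATE_HZ = 16000  # 16 kHz, as stated in the description above


def describe_wav(path):
    """Return (sample_rate_hz, channels, duration_seconds) for a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        rate = wav.getframerate()
        channels = wav.getnchannels()
        duration = wav.getnframes() / rate
    return rate, channels, duration


def has_expected_rate(path, expected=EXPECTED_RATE_HZ):
    """True if the file's sampling rate matches the expected rate."""
    return describe_wav(path)[0] == expected
```

Calling `has_expected_rate` on each waveform in a distribution is a cheap way to catch files that were resampled or corrupted in transit before starting a voice build.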
CMU ARCTIC Databases
CMU ARCTIC other accents
CMU ARCTIC additional databases
CMU ARCTIC all 18 datasets
- All datasets packed
- do_arctic: a script to download and build a full voice from these databases (assuming the FestVox build tools are all installed).
These distributions include Festival CLUNITS based voices.
HTS based voices for bdl, slt, jmk and awb are
available from
http://hts.ics.nitech.ac.jp/, built using Nagoya Institute of Technology's
HTS HMM-based Speech Synthesis System.
Acknowledgements
This work was partially supported by the U.S. National Science
Foundation under Grant No. 0219687, "ITR/CIS Evaluation and
Personalization of Synthetic Voices". Any opinions, findings, and
conclusions or recommendations expressed in this material are
those of the authors and do not necessarily reflect the views of
the National Science Foundation.