The CMU_SIN database was constructed at the Language Technologies
Institute at Carnegie Mellon University as a database of speech in noise
designed for use in unit selection speech synthesis research. This work
builds on the CMU Arctic database and was carried out as part of the
Let's Go Project, which aims to improve spoken dialog systems for
non-native speakers and the elderly.
The distributions include a 500-utterance subset of the CMU
Arctic database, recorded from one male US English speaker. There
are two versions of these recordings: one where the speaking style
is speech in noise (sin) and one with normal speaking style
(swn). We also include two complete Festival voices built from
these recordings. It is possible to build a voice from both sets
of recordings that can speak both normally and in noise, though we
are not distributing such a voice at this time. The distribution
also contains the noise source used for these recordings, along with
the modifications to the FestVox voice building script prompt_them
that provide a method of obtaining clean recordings of speech in
noise.
Full phonetic labeling was performed with CMU Sphinx using the
FestVox-based labeling scripts. The automatic labels have not been
hand-corrected.
A description of this work is published in "Creating a Database of
Speech in Noise for Unit Selection Synthesis" by Brian Langner and
Alan W Black, at the 5th ISCA Speech Synthesis Workshop.
Acknowledgements
This work is supported by the US National Science Foundation under
grant number 0208835, "LET'S GO: improved speech interfaces for
the general public". Any opinions, findings, and conclusions or
recommendations expressed in this material are those of the
authors and do not necessarily reflect the views of the National
Science Foundation.