Blizzard 2007
in conjunction with the
Sixth ISCA Workshop on Speech Synthesis
Bonn, Germany / August 25, 2007

Blizzard Entry: Integrated Voice Building and Synthesis for Unit-Selection TTS

Christian Weiss, Sergio Paulo, Luis Figueira, Luis C. Oliveira

Spoken Language Systems Laboratory, INESC-ID, Lisbon, Portugal

In this paper we describe the system we used for the 2007 Blizzard Challenge TTS evaluation task. Following the rules, we built three voices from the given speech database: the first from the full data, the second from the ARCTIC subset, and the third from a self-defined subset. The self-defined subset was chosen by a text-selection algorithm that selected sentences from the full set of recordings. Although the Blizzard team provides an already labeled corpus, we built all the voices from scratch using our own segmentation and pre-processing tools. Accordingly, this paper presents our segmentation algorithm, the text-selection algorithm for choosing an optimal subset of the full speech corpus, and the voice building and synthesis system itself. Since a TTS system can be divided into an offline and an online process, we describe the offline pre-processing and the modules we use to prepare the data, and then the online synthesis runtime process in which the acoustic sound files are generated.

Bibliographic reference.  Weiss, Christian / Paulo, Sergio / Figueira, Luis / Oliveira, Luis C. (2007): "Blizzard Entry: integrated voice building and synthesis for unit-selection TTS", In BLZ3-2007, paper 009.