Blizzard 2007
in conjunction with the
Sixth ISCA Workshop on Speech Synthesis
Bonn, Germany / August 25, 2007

The Cerevoice Blizzard Entry 2007: Are Small Database Errors Worse than Compression Artifacts?

Matthew P. Aylett (1), J. Sebastian Andersson (2), Leonardo Badino (2), Christopher J. Pidcock (1)

(1) CereProc Ltd, Edinburgh, UK; (2) CSTR, University of Edinburgh, UK

In commercial systems, the memory footprint of a unit selection system is often a key issue. This is especially true for PDAs and other embedded devices. In this year's Blizzard entry, CereProc® set itself the criterion that the full database system entered would have a smaller memory footprint than either of the two smaller database entries. This was accomplished by applying Speex speech compression to the full database entry. In turn, a set of techniques used in last year's entry to improve the quality of small database systems was extended. Finally, for all systems, two quality control methods were applied to the underlying database to improve how well the lexicon and transcription match the underlying data. Results suggest that the mild audio quality artifacts introduced by lossy compression have almost as much impact on MOS-rated perceived quality as the concatenation errors introduced by sparse data in the smaller systems with bulked diphones.

