Generating F0 contours for speech synthesis using the Tilt intonation theory

Kurt Dusterhoff and Alan W Black

Centre for Speech Technology Research
University of Edinburgh
80 South Bridge

Originally published in the Proceedings of the 1997 ESCA Workshop on intonation, Athens, Greece


This paper presents a method for generating F0 contours for a speech synthesis system using the Tilt intonation theory ([10], [9]). The Tilt theory offers an abstract description of natural F0 contours that can be derived automatically from natural speech. Given a speech database labelled with Tilt events, this paper shows how that data can be used to train a model that adequately predicts Tilt parameters from features available in a text-to-speech system, and hence produces natural-sounding F0 contours. After a short description of the Tilt theory, the database and the features used to generate the parameters are presented. The results are compared with those of a previous, similar experiment on the same database that used the ToBI intonation labelling system [2]. The Tilt method not only produces better results (RMSE 32.5 and correlation 0.60) but, because it allows data to be labelled automatically, also promises easier training from general speech databases.
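The two figures quoted above, RMSE and correlation, compare a generated F0 contour with a reference contour. The following is a minimal sketch of how such scores might be computed, not the evaluation procedure used in this work. It assumes that both contours have already been sampled at the same time points and that F0 values are plain numbers. The function names and the sample values are hypothetical and are not taken from the paper.

```python
import math


def rmse(predicted, reference):
    """Root mean squared error between two equal-length F0 sequences."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("sequences must be non-empty and of equal length")
    squared_error = sum((p - r) ** 2 for p, r in zip(predicted, reference))
    return math.sqrt(squared_error / len(predicted))


def pearson_correlation(predicted, reference):
    """Pearson correlation coefficient between two equal-length F0 sequences."""
    n = len(predicted)
    if n != len(reference) or n < 2:
        raise ValueError("sequences must have equal length of at least 2")
    mean_p = sum(predicted) / n
    mean_r = sum(reference) / n
    covariance = sum((p - mean_p) * (r - mean_r)
                     for p, r in zip(predicted, reference))
    var_p = sum((p - mean_p) ** 2 for p in predicted)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_p == 0 or var_r == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / math.sqrt(var_p * var_r)


if __name__ == "__main__":
    # Hypothetical F0 samples (illustrative values only, not from the paper).
    reference_f0 = [120.0, 135.0, 150.0, 140.0, 110.0]
    predicted_f0 = [118.0, 130.0, 160.0, 145.0, 100.0]
    print("RMSE:", rmse(predicted_f0, reference_f0))
    print("Correlation:", pearson_correlation(predicted_f0, reference_f0))
```

The two measures are complementary. RMSE penalises absolute differences between the contours, while correlation reflects how closely the generated contour follows the shape of the reference regardless of a constant offset or scale.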

Kurt Dusterhoff
Tue Jul 1 11:51:11 BST 1997