6 Festival's Scheme Programming Language

This chapter acts as a reference guide for the particular dialect of the Scheme programming language used in the Festival Speech Synthesis System. Scheme is a dialect of Lisp designed to be more consistent. It was chosen for the basic scripting language in Festival because:

Having a scripting language in Festival is actually one of the fundamental properties that makes Festival a useful system. The fact that new voices and languages can in many cases be added without changing the underlying C++ code makes the system much more powerful and accessible than a more monolithic system that requires recompilation for any parameter change. As there is sometimes confusion, we should make it clear that Festival contains its own Scheme interpreter as part of the system. Festival can be viewed as a Scheme interpreter that has had basic additions made to its functions to include modules that can do speech synthesis. No external Scheme interpreter is required to use Festival.

The actual interpreter used in Festival is based on George Carrette's SIOD, "Scheme in One Defun". This has been substantially enhanced from its small, elegant beginnings into something that might better be called "Scheme in one directory". Although there is a standard for Scheme, the version in Festival does not fully follow it, for both good and bad reasons. Thus, in order for people to be able to program in Festival's Scheme, we provide this chapter to list the core types, functions, etc., and some examples. We do not pretend to be teaching programming here, but as we know that many people who are interested in building voices are not primarily programmers, some guidance on the language and its usage will make the simple programming that is required in building voices more accessible.

For reference, the Scheme Revised Revised Revised report describes the standard definition srrrr90. For a good introduction to programming in general that happens to use Scheme as its example language we recommend abelson85. Also, for those who are unfamiliar with the use of Lisp-like scripting languages, we recommend a close look at GNU Emacs, which uses Lisp as its underlying scripting language. Knowledge of the internals of Emacs did to some extent influence the scripting language design of Festival.

6.1 Overview

"Lots of brackets" is what comes to most people's minds when considering Lisp and its various derivatives such as Scheme. At the start this can seem daunting, and it is true that parenthesis errors can cause problems. But with an editor that does proper bracket matching, brackets can actually be a help to code structure rather than a hindrance.

The fundamental structure is the s-expression. It consists of an atom, or a list of s-expressions. This simply defined recursive structure allows complex structures to easily be specified. For example

3
(1 2 3)
(a (b c) d)
((a b) (d e))

Unlike most other programming languages, Scheme's data and code are in the same format, s-expressions. Thus s-expressions are evaluated, recursively.

Symbols:
are treated as variables and, when evaluated, return their currently set value.
Strings and numbers:
evaluate to themselves.
Lists:
Each member of the list is evaluated, and the first item in the list is treated as a function and applied using the remainder of the list as arguments to the function.

Thus the s-expression

(+ 1 2)

when evaluated will return 3, as the symbol + is bound to a function that adds its arguments.
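
Because each member of a list is itself evaluated before the function is applied, expressions can be nested. The following is a small sketch of that rule, using only arithmetic functions listed later in this chapter; the particular numbers are arbitrary.

```scheme
; Inner lists are evaluated first, then the outer function is applied.
(* (+ 1 2) (- 10 4))    ; (+ 1 2) is 3 and (- 10 4) is 6, so this returns 18
(/ (- 212 32) 9)        ; (- 212 32) is 180, so this returns 20
```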

Variables may be set using the set! function which takes a variable name and a value as arguments

(set! a 3)

The set! function is unusual in that it does not evaluate its first argument. If it did, you would have to explicitly quote it, or set some other variable to have a value of a, to get the desired effect.
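
The difference between an evaluated and a quoted argument can be seen in a short sketch. The variable names b, c and d are chosen here only for illustration.

```scheme
(set! a 3)           ; the symbol a is not evaluated; it now has the value 3
(set! b a)           ; the second argument is evaluated, so b is set to 3
(set! c 'a)          ; the quote stops evaluation, so c is set to the symbol a
(set! d (quote a))   ; 'a is shorthand for (quote a), so d is also the symbol a
```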

quoting, define

6.2 Data Types

There are a number of basic data types in this Scheme. New ones may also be added, but only through C++ functions. The basic types are

Symbols:
Symbols are atoms starting with an alphabetic character. Unlike numbers and strings, they may be used as variables. Examples are
a bcd f6 myfunc 
plus cond
Symbols may be created from strings by using the function intern
Numbers:
In this version of Scheme all numbers are doubles; there is no distinction between floats, doubles and ints. Examples are
1
1.4
3.14
345
3456756.4345476
Numbers evaluate to themselves, that is the value of the atom 2 is the number 2.
Strings:
Strings are bounded by the double quote characters ". For example
"a"
"abc"
"This is a string"
Strings evaluate to themselves. They may be converted to symbols with the function intern. If a string consists of characters that represent a number, you can convert it to a number with the function parse-number. For example
(intern "abc") => abc
(parse-number "3.14") => 3.14
Although you can make symbols from numbers, you should not do that. A double quote may be specified within a string by escaping it with a backslash. Backslashes therefore also require an escape backslash. That is, "ab\"c" contains four characters, a, b, " and c. "ab\\c" also contains four characters, a, b, \ and c. And "ab\\\"c" contains five characters, a, b, \, " and c.
Lists or Cons:
Lists start with a left parenthesis and end with a right parenthesis, with zero or more s-expressions between them. For example
(a b c)
()
(b (b d) e)
((the boy) saw (the girl (in (the park))))
Lists can be made by various functions, most notably cons and list. cons returns a list whose first item, standardly called its car, is the first argument of cons, and whose remainder, standardly called its cdr, is the second argument of cons.
(cons 'a '(b c)) => (a b c)
(cons '(a b) '(c d)) => ((a b) c d)
Functions:
Functions may be applied explicitly by the function apply or, more normally, when they appear as the first item in a list to be evaluated. The normal way to define a function is using the define function. For example
(define (ftoc temp)
   (/ (* (- temp 32) 5) 9))
This binds the function to the variable ftoc. Functions can also be defined anonymously, which is sometimes convenient.
(lambda (temp)
   (/ (* (- temp 32) 5) 9))
returns a function.
Others:
Other internal types are supported by Festival's Scheme, including some important object types used for synthesis such as utterances, waveforms, items etc. They are normally printed in the form
#<Utterance 6234>
#<Wave 1294>
The print form is a convenience form only. Entering that string of characters will not allow a reference to that object. The number is unique to that object instance (it is actually the internal address of the object), and can be used visually to note if objects are the same or not.
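
The conversions and constructors described for the basic types above can be tried together. This is only a sketch: the variable names sym, num and lst are made up for illustration, and mapcar, which applies a function to each item of a list, is assumed to be available although it is not described in this chapter.

```scheme
; Strings can be converted to symbols and numbers.
(set! sym (intern "abc"))             ; the symbol abc
(set! num (parse-number "3.14"))      ; the number 3.14

; cons adds one item to the front of an existing list.
(set! lst (cons sym '(b c)))          ; (abc b c)

; A named function, applied directly and through apply.
(define (ftoc temp)
  (/ (* (- temp 32) 5) 9))
(ftoc 212)                            ; 100
(apply ftoc '(32))                    ; 0

; The same calculation as an anonymous function, applied to each item.
(mapcar (lambda (temp) (/ (* (- temp 32) 5) 9))
        '(32 212))                    ; (0 100)
```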

6.3 Functions

This section lists the basic functions in Festival's Scheme. It doesn't list them all (see the Festival manual for that) but does highlight the key functions that you should normally use.

6.3.1 Core functions

These functions are the basic functions used in Scheme. They include the structural functions for setting variables, conditionals, loops, etc.

(set! SYMBOL VALUE)
Sets SYMBOL to VALUE. SYMBOL is not evaluated, while VALUE is. Example
(set! a 3)
(set! pi 3.14)
(set! fruit '(apples pears bananas))
(set! fruit2 fruit)
(define (FUNCNAME ARG0 ARG1 ...) . BODY)
define a function called FUNCNAME with specified arguments and body.
(define (myadd a b) (+ a b))
(define (factorial a)
 (cond
  ((< a 2) 1)
  (t (* a (factorial (- a 1))))))
(if TEST TRUECASE [FALSECASE] )
If the value of TEST is non-nil, evaluate TRUECASE and return its value; otherwise, if FALSECASE is present, evaluate it and return its value; otherwise return nil.
(if (string-equal v "apples")
   (format t "It's an apple\n")
   (format t "It's not an apple\n"))
(if (member v '(apples pears bananas))
   (begin
       (format t "It's a fruit (%s)\n" v)
       'fruit)
   'notfruit)
(cond (TEST0 . BODY) (TEST1 . BODY) ...)
A multiple if statement. Evaluates each TEST until a non-nil test is found, then evaluates each of the expressions in that BODY and returns the value of the last one.
(cond
  ((string-equal v "apple")
   'ringo)
  ((string-equal v "plum")
   'ume)
  ((string-equal v "peach")
   'momo)
  (t
   'kudamono))
(begin . BODY )
This evaluates each s-expression in BODY and returns the value of the last s-expression in the list. This is useful for cases where only one s-expression is expected but you need to call a number of functions, notably in the if function.
(if (string-equal v "pear")
    (begin
       (format t "assuming it's an Asian pear\n")
       'nashi)
    'kudamono)
(or . DISJ)
Evaluate each disjunct until one is non-nil and return that value.
(or (string-equal v "tortoise")
    (string-equal v "turtle"))
(or (string-equal v "pear")
    (string-equal v "apple")
    (< num_fruits 6))
(and . CONJ)
Evaluate each conjunct until one is nil and return that value, or return the value of the last conjunct.
(and (< num_fruits 10)
     (> num_fruits 3))
(and (string-equal v "pear")
     (< num_fruits 6)
     (or (string-equal day "Tuesday")
         (string-equal day "Wednesday")))
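
The core functions are usually combined. The following sketch defines two small functions using define, cond, and, and let (let, which introduces local variables, appears in the I/O example later in this chapter). The function names count_matches and apple_report are made up for illustration.

```scheme
; Count how many items in a list are string-equal to a given name.
(define (count_matches name items)
  (cond
    ((not items) 0)                          ; empty list: nothing to count
    ((string-equal name (car items))
     (+ 1 (count_matches name (cdr items))))
    (t (count_matches name (cdr items)))))

; Classify a basket of fruit by how many apples it holds.
(define (apple_report basket)
  (let ((n (count_matches "apple" basket)))
    (cond
      ((= n 0) 'none)
      ((and (> n 0) (< n 3)) 'few)
      (t 'many))))

(apple_report '(apple pear apple))           ; two apples, so this returns few
```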

6.3.2 List functions

(car EXPR)
Returns the "car" of EXPR. For a list this is the first item; for an atom or the empty list it is defined to be nil.
(car '(a b)) => a
(car '((a b) c d)) => (a b)
(car '(a (b c) d)) => a
(car nil) => nil
(car 'a) => nil
(cdr EXPR)
Returns the "cdr" of EXPR. For a list this is the rest of the list; for an atom or the empty list it is defined to be nil.
(cdr '(a b)) => (b)
(cdr '((a b) c d)) => (c d)
(cdr '(a)) => nil
(cdr '(a (b c))) => ((b c))
(cdr nil) => nil
(cdr 'a) => nil
(cons EXPR0 EXPR1)
build a new list whose "car" is EXPR0 and whose "cdr" is EXPR1.
(cons 'a '(b c)) => (a b c)
(cons 'a ()) => (a)
(cons '(a b) '(c d)) => ((a b) c d)
(cons () '(a)) => (nil a)
(cons 'a 'b) => (a . b)
(cons nil nil) => (nil)
(list . BODY)
Form a list from each of the arguments
(list 'a 'b 'c) => (a b c)
(list '(a b) 'c 'd) => ((a b) c d)
(list nil '(a b) '(a b)) => (nil (a b) (a b))
(append . BODY)
Join each of the arguments (lists) into a single list
(append '(a b) '(c d)) => (a b c d)
(append '(a b) '((c d)) '(e f)) => (a b (c d) e f)
(append nil nil) => nil
(append '(a b)) => (a b)
(append 'a 'b) => error
(nth N LIST)
Return Nth member of list, the first item is the 0th member.
(nth 0 '(a b c)) => a
(nth 2 '(a b c)) => c
(nth 3 '(a b c)) => nil
(nth_cdr N LIST)
Return the Nth cdr of LIST. The 0th cdr is the list itself.
(nth_cdr 0 '(a b c)) => (a b c)
(nth_cdr 2 '(a b c)) => (c)
(nth_cdr 1 '(a b c)) => (b c)
(nth_cdr 3 '(a b c)) => nil
(last LIST)
The last cdr of a list. Traditionally this function has always been called last rather than last_cdr.
(last '(a b c)) => (c)
(last '(a b (c d))) => ((c d))
(reverse LIST)
Return the list in reverse order
(reverse '(a b c)) => (c b a)
(reverse '(a)) => (a)
(reverse '(a b (c d))) => ((c d) b a)
(member ITEM LIST)
Returns the cdr in LIST whose car is ITEM, or nil if it is not found.
(member 'b '(a b c)) => (b c)
(member 'c '(a b c)) => (c)
(member 'd '(a b c)) => nil
(member 'b '(a b c b)) => (b c b)
Note that member uses eq to test equality, hence this does not work for strings. You should use member_string if the list contains strings.
(assoc ITEM ALIST)
A-lists are a standard list format for representing feature value pairs. An a-list is basically a list of pairs of name and value; although the name may be any Lisp item, it is usually a symbol. A typical a-list is
((name AH)
 (duration 0.095)
 (vowel +)
 (occurs ("file01" "file04" "file07" "file24"))
)
assoc is a function that allows you to look up values in an a-list
(assoc 'name '((name AH) (duration 0.95))) => (name AH)
(assoc 'duration '((name AH) (duration 0.95))) => (duration 0.95)
(assoc 'vowel '((name AH) (duration 0.95))) => nil
Note that assoc uses eq to test equality, hence this does not work for names that are strings. You should use assoc_string if the a-list uses strings for names.
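
Since assoc returns the whole pair, the value usually has to be picked out with car and cdr. A minimal sketch follows; the function name feat_value and the variables seg and seg2 are made up for illustration.

```scheme
(set! seg '((name AH) (duration 0.095) (vowel +)))

; Each entry is a two-item list, so the value is the car of the cdr.
(define (feat_value name alist)
  (car (cdr (assoc name alist))))

(feat_value 'name seg)          ; AH
(feat_value 'duration seg)      ; 0.095
(feat_value 'stress seg)        ; nil, since car and cdr of nil are nil

; Add a new pair to the front; seg itself is left unchanged.
(set! seg2 (cons '(stress 1) seg))
(feat_value 'stress seg2)       ; 1
```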

6.3.3 Arithmetic functions

+ - * / exp log sqrt < > <= >= =
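
A few sketches of these functions in use, with arbitrary numbers. The comparison functions return t or nil.

```scheme
(sqrt (+ (* 3 3) (* 4 4)))     ; the square root of 25, which is 5
(/ 7 2)                        ; 3.5, since all numbers are doubles
(< 2 3)                        ; t
(>= 2 3)                       ; nil
```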

6.3.4 I/O functions

File names in Festival use the Unix convention of using "/" as the directory separator. However, under other operating systems, such as Windows, the "/" will be appropriately mapped into backslash as required. For most cases you do not need to worry about this, and if you use forward slash all the time it will work.

(format FD FORMATSTRING . ARGS)
The format function is a little unusual in Lisp. It basically follows the printf command in C, or more closely the format function in Emacs Lisp. It is designed to print out information that isn't necessarily to be read in by Lisp (unlike pprintf, print and printfp). FD is a file descriptor as created by fopen, and the result is printed to that. Two special values are also allowed there: t causes the output to be sent to standard out (which is usually the terminal), and nil causes the output to be written to a string and returned by the function. Also, the variable stderr is set to a file descriptor for standard error output. The format string closely follows the format used in C's printf functions; it is actually interpreted by those functions in the implementation. format supports the following directives
%d
Print as integer
%x
Print as integer in hexadecimal
%f
Print as float
%s
Convert item to string
%%
A percent character
%g
Print as double
%c
Print number as character
%l
Print as Lisp object
In addition, directive sizes are supported, including (zero or space) padding, and widths. Explicitly specified sizes as arguments, as in %*s, are not supported, nor is %p for pointers. The %s directive will try to convert the corresponding Lisp argument to a string before passing it to the low level print function. Thus lists will be printed to strings, and numbers are also converted. This form will lose the distinction between Lisp symbols and Lisp strings, as the quotes will not be present in the %s form. In general %s should be used for getting nice human output and not for machine readable output, as it is a lossy print form. In contrast, %l is designed to preserve the Lisp forms so they can be more easily read: quotes will appear, and escapes for embedded quotes will be treated properly.
(format t "duration %0.3f\n" 0.12345) => duration 0.123
(format t "num %d\n" 23) => num 23
(format t "num %04d\n" 23) => num 0023
(pprintf SEXP [FD])
Pretty print the given expression to standard out (or FD if specified). Pretty printing is a technique that inserts newlines and indentation in the printout to make the Lisp expression easier to read.
(fopen FILENAME MODE)
This creates a file descriptor, which can be used in the various I/O functions. It closely follows the C stdio fopen function. The mode may be
"r"
to open the file for reading
"w"
to open the file for writing
"a"
to open the file at the end for writing (so-called, append).
"b"
File I/O in binary (for OS's that make the distinction).
Or any combination of these.
(fclose FD)
Close a file descriptor as created by fopen.
(read)
Read next s-expression from standard in
(readfp FD)
Read next s-expression from given file descriptor FD. On end of file it returns an s-expression eq to the value returned by the function (eof-val). A typical example use of these functions is
(let ((ifd (fopen infile "r"))
      (ofd (fopen outfile "w"))
      (word))
   (while (not (equal? (set! word (readfp ifd)) (eof-val)))
      (format ofd "%l\n" (lex.lookup word nil)))
   (fclose ifd)
   (fclose ofd))
(load FILENAME [NOEVAL])
Load in the s-expressions in FILENAME. If NOEVAL is unspecified the s-expressions are evaluated as they are read. If NOEVAL is specified and non-nil, load will return all s-expressions in the file un-evaluated in a single list.
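
As a further sketch of how fopen, format, readfp and fclose fit together, the following writes an s-expression to a file with %l, so that quotes are preserved, and reads it back. The function name save_and_reload and the file name are made up for illustration, and a writable /tmp directory is assumed.

```scheme
; Write SEXP to FNAME in a machine readable form, then read it back.
(define (save_and_reload sexp fname)
  (let ((fd (fopen fname "w"))
        (result nil))
    (format fd "%l\n" sexp)       ; %l keeps string quotes so it can be re-read
    (fclose fd)
    (set! fd (fopen fname "r"))
    (set! result (readfp fd))     ; first s-expression in the file
    (fclose fd)
    result))

(save_and_reload '(a "b" 3) "/tmp/festival_example.scm")

; With nil as FD, format returns the result as a string instead of printing it.
(format nil "num %04d" 23)        ; "num 0023"
```

The value read back should be equal? to the list that was written, which would not be the case in general if %s had been used for output.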

6.3.5 String functions

As in many other languages, Scheme has a distinction between strings and symbols. Strings evaluate to themselves and cannot be assigned other values. Symbols with the same print name are equal?, while strings with the same contents aren't necessarily.

In Festival's Scheme, strings are eight-bit clean and designed to hold strings of text and characters in whatever language is being synthesized. Strings are always treated as strings of 8-bit characters, even though some languages may interpret these as 16-bit characters. Symbols, in general, should not contain 8-bit characters.

(string-equal STR1 STR2)
Finds the string forms of STR1 and STR2 and returns t if these are equal, and nil otherwise. Symbol names and numbers are mapped to strings, though you should be aware that the mapping of a number to a string may not always produce what you hope for. The number 0 may be mapped to "0" or maybe to "0.0", so you should not depend on the mapping. You can use format to map a number to a string in an explicit manner. It is, however, safe to pass symbol names to string-equal. In most cases string-equal is the right function to use rather than equal?, which is much stricter about its definition of equality.
(string-equal "hello" "hello") => t
(string-equal "hello" "Hello") => nil
(string-equal "hello" 'hello) => t
(string-append . ARGS)
For each argument coerce it to a string, and return the concatenation of all arguments.
(string-append "abc" "def") => "abcdef"
(string-append "/usr/local/" "bin/" "festival") => "/usr/local/bin/festival"
(string-append "/usr/local/" t 'hello) => "/usr/local/thello"
(string-append "abc") => "abc"
(string-append ) => ""
(member_string STR LIST)
returns nil if no member of LIST is string-equal to STR, otherwise it returns t. Again, this is often the safe way to check membership of a list as this will work properly if STR or the members of LIST are symbols or strings.
(member_string "a" '("b" "a" "c")) => t
(member_string "d" '("b" "a" "c")) => nil
(member_string "d" '(a b c d)) => t
(member_string 'a '("b" "a" "c")) => t
(string-before STR SUBSTR)
Returns the initial prefix of STR up to the first occurrence of SUBSTR in STR. If SUBSTR doesn't exist within STR the empty string is returned.
(string-before "abcd" "c") => "ab"
(string-before "bin/make_labs" "/") => "bin"
(string-before "usr/local/bin/make_labs" "/") => "usr"
(string-before "make_labs" "/") => ""
(string-after STR SUBSTR)
Returns the longest suffix of STR after the first occurrence of SUBSTR in STR. If SUBSTR doesn't exist within STR the empty string is returned.
(string-after "abcd" "c") => "d"
(string-after "bin/make_labs" "/") => "make_labs"
(string-after "usr/bin/make_labs" "/") => "bin/make_labs"
(string-after "make_labs" "/") => ""
(length STR)
Returns the length of the given string (or list). length does not coerce its argument into a string, hence giving a symbol as argument is an error.
(length "") => 0
(length "abc") => 3
(length 'abc) => SIOD ERROR
(length '(a b c)) => 3
(symbolexplode SYMBOL)
returns a list of single character strings for each character in SYMBOL's print name. This will also work on strings.
(symbolexplode 'abc) => ("a" "b" "c")
(symbolexplode 'hello) => ("h" "e" "l" "l" "o")
(intern STR)
Convert a string into a symbol with the same print name.
(string-matches STR REGEX)
Returns t if STR matches REGEX regular expression. Regular expressions are described more fully below.
(string-matches "abc" "a.*") => t
(string-matches "hello" "[Hh]ello") => t
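
The string functions combine well for simple file name manipulation. The sketch below strips leading directories from a path by repeatedly taking the part after the first "/". The function name path_basename is made up for illustration, and the sketch assumes, as the string-matches examples suggest, that the regular expression must match the whole string.

```scheme
; Return the part of PATH after its last "/", or PATH itself if it has none.
(define (path_basename path)
  (if (string-matches path ".*/.*")            ; does PATH contain a "/"?
      (path_basename (string-after path "/"))  ; drop the first directory
      path))

(path_basename "usr/local/bin/make_labs")      ; "make_labs"
(path_basename "make_labs")                    ; "make_labs"

; Build a new file name from the pieces.
(string-append (string-before "make_labs.scm" ".") "_v2" ".scm")
                                               ; "make_labs_v2.scm"
```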

6.3.6 System functions

In order to interact more easily with the underlying operating system, Festival Scheme includes a number of basic functions that allow Scheme programs to make use of operating system functions.

(system COMMAND)
Evaluates the command with the Unix shell (or equivalent). It's not clear how this should (or does) work on other operating systems, so it should be used sparingly if the code is to be portable.
(system "ls") => lists files in current directory.
(system (format nil "cat %s" filename))
(get_url URL OFILE)
Copies the contents of URL into OFILE. It supports `file:' and `http:' prefixes, but currently does not support the ftp: protocol.
(get_url "http://www.cstr.ed.ac.uk/projects/festival.html" "festival.html")
(setenv NAME VALUE)
Set environment variable NAME to VALUE which should be strings
(setenv "DISPLAY" "nara.mt.cs.cmu.edu:0.0")
(getenv NAME)
Get value of environment variable NAME.
(getenv "DISPLAY")
(getpid)
The process id, as a number. This is useful when creating files that need to be unique for the festival instance.
(set! bbbfile (format nil "/tmp/stuff.%05d" (getpid)))
(cd DIRECTORY)
Change directory.
(cd "/tmp")
(pwd)
return a string which is a pathname to the current working directory.
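
A common use of these functions is to create a scratch file that is unique to the running Festival process. This is only a sketch: the function name make_tmp_name and the variable tmpfile are made up for illustration, and a Unix shell and a writable /tmp directory are assumed.

```scheme
; Build a file name that includes the process id of this Festival instance.
(define (make_tmp_name prefix)
  (format nil "/tmp/%s.%05d" prefix (getpid)))

(set! tmpfile (make_tmp_name "example"))

; Create the file, display its contents, then remove it again.
(system (format nil "echo hello > %s" tmpfile))
(system (format nil "cat %s" tmpfile))
(system (format nil "rm -f %s" tmpfile))
```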

6.3.7 Utterance Functions

6.3.8 Synthesis Functions

6.4 Debugging and Help

6.5 Adding new C++ functions to Scheme

6.6 Regular Expressions

Regular expressions are fundamentally useful in any text processing language. This is also true in Festival's Scheme. The function string-matches and a number of other places (notably CART trees) allow the use of regular expressions to match strings.

We will not go into the formal aspects of regular expressions but just give enough discussion to help you use them here. See regexbook for probably more information than you'll ever need.

Each implementation of regexes may be slightly different, hence here we lay out the full syntax and semantics of our regex patterns. This is not an arbitrary selection: when Festival was first developed we used the GNU libg++ Regex class, but for portability to non-GNU systems we had to replace that with our own implementation based on Henry Spencer's regex code (which is at the core of many regex libraries).

In general all characters match themselves, except for the following, which (can) have special interpretations

. * + ? [ ] - ( ) | ^ $ \

If these are preceded by a backslash then they no longer will have special interpretation.

.
Matches any character.
(string-matches "abc" "a.c") => t
(string-matches "acc" "a.c") => t
*
Matches zero or more occurrences of the preceding item in the regex
(string-matches "aaaac" "a*c") => t
(string-matches "c" "a*c") => t
(string-matches "anythingc" ".*c") => t
(string-matches "canythingatallc" "c.*c") => t
+
Matches one or more occurrences of the preceding item in the regex
(string-matches "aaaac" "a+c") => t
(string-matches "c" "a+c") => nil
(string-matches "anythingc" ".+c") => t
(string-matches "c" ".+c") => nil
(string-matches "canythingatallc" "c.+c") => t
(string-matches "cc" "c.+c") => nil
?
Matches zero or one occurrences of the preceding item. That is, it makes the preceding item optional.
(string-matches "abc" "ab?c") => t
(string-matches "ac" "ab?c") => t
[ ]
Defines a set of characters. This can also be used to define a range. For example, [aeiou] is any lower case vowel, [a-z] is any lower case letter from a through z, and [a-zA-Z] is any letter, upper or lower case. If ^ is specified first it negates the class, thus [^a-z] matches anything but a lower case letter.
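
The operators above can be combined to classify tokens, which is a typical use in text processing rules. The following is a sketch only; the function name token_type and the category names are made up for illustration, and it relies on string-matches requiring the whole string to match, as the examples above indicate.

```scheme
; Classify a token by the first pattern that matches the whole token.
(define (token_type tok)
  (cond
    ((string-matches tok "[0-9]+") 'digits)           ; one or more digits
    ((string-matches tok "[A-Z][a-z]*") 'capitalized) ; e.g. Festival
    ((string-matches tok "[a-z]+") 'lowercase)        ; e.g. hello
    (t 'other)))

(token_type "1998")       ; digits
(token_type "Festival")   ; capitalized
(token_type "hello")      ; lowercase
(token_type "x-ray")      ; other
```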

6.7 Some Examples

