Identifying Quoted Speech Types

Not every piece of quoted speech is an independent unit. Just as the words in a sentence are connected, adjacent pieces of quoted speech are connected and often share properties, such as being spoken by the same speaker. An important feature of ESPER's quote-identification module is its ability to detect whether a piece of quoted speech is a new quote (NEW), most likely spoken by a different speaker from the previous quote, or a continuation quote (CONT), which carries on the previous speech and is spoken by the same speaker. A CSML marked-up example of a new quote followed by a continuation quote is shown below:

<QUOTE TYPE="NEW"> `Come, there's no use
in crying like that!'</QUOTE> said 
Alice to herself, rather sharply; 
<QUOTE TYPE="CONT"> `I advise you to
leave off this minute!' </QUOTE>

Identifying the connectivity between quoted speech segments can significantly reduce the work needed to identify the speaker of each segment. Because we assume that continuation segments (CONT) share the speaker of their predecessors, once we identify the speaker of a quoted speech segment we can apply that speaker to all of its continuation segments, reducing both computational effort and the build-up of errors.
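
The propagation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not ESPER's actual implementation: the function name `assign_speakers`, the `(type, text)` tuple representation, and the `identify_speaker` callback (standing in for whatever speaker-identification step is used) are all hypothetical.

```python
from typing import Callable, List, Tuple

# Hypothetical representation: (quote type, quote text),
# where the type is "NEW" or "CONT" as in the CSML markup.
Quote = Tuple[str, str]


def assign_speakers(
    quotes: List[Quote],
    identify_speaker: Callable[[str], str],
) -> List[str]:
    """Return one speaker per quote, running identification only when needed.

    A CONT quote reuses the speaker of its predecessor, so the (assumed
    expensive) identify_speaker callback is called only for NEW quotes,
    or for a CONT quote that has no predecessor to inherit from.
    """
    speakers: List[str] = []
    for quote_type, text in quotes:
        if quote_type == "CONT" and speakers:
            # Continuation: copy the speaker of the previous quote.
            speakers.append(speakers[-1])
        else:
            # New quote (or nothing to inherit from): identify the speaker.
            speakers.append(identify_speaker(text))
    return speakers
```

For the Alice example above, the callback would be invoked once for the NEW quote, and the CONT quote would simply inherit its result. The same property means that a wrong answer for a NEW quote is also copied to its continuations, so the gain depends on the NEW/CONT labels being reliable.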

Alan W Black 2003-10-20