MIT Department of Electrical Engineering & Computer Science

Computational Modeling of Prosody for Automatic Speech Synthesis and Understanding

Mari Ostendorf
Boston University

Wednesday, February 24, 1999
4:00 PM (refreshments 3:45)
Grier Room, Room 34-401A
EECS Special Seminar

Abstract

Prosody includes the phrasing and emphasis in a spoken word sequence and can be thought of as the punctuation in speech, though it actually carries much more information than punctuation in written text. Prosodic patterns can be an important source of information for interpreting an utterance, both for human listeners and computers. However, prosody is currently underutilized in computer speech processing, because of the difficulty of modeling events at several different time scales and because of a past focus on a neutral reading style that does not generalize to many applications.

In this talk, we describe a computational model of prosody that addresses these challenges by combining results from linguistics research with the theory of statistical pattern recognition. Statistical modeling enables automatic learning to capture style-dependent variability, while linguistic theory informs a multi-level structure that models both local detail and global trends. The framework also includes components for both the continuous acoustic realization and the symbolic phonotactic structure, analogous to component models used in speech recognition. We will show how the same model can be used in both recognition and synthesis applications, giving examples for several speech processing problems.


URL of this page: http://www-eecs.mit.edu/AY98-99/events/14.html
Created: Feb 11, 1999  | Modified: Feb 12, 1999
This event is from the MIT EECS 1998-99 archive.