The aims of this work are twofold: first, to explore prosody as a communicative channel that conveys linguistic, social, and emotional meanings; and second, to provide a classification model of the emotional properties of speech using multiple kinds of information from the speech signal, e.g., duration, fundamental frequency, formants, and voice quality.
Emotional communicative agents rely on prosodic information to identify emotional states. Previous research with such agents has demonstrated robust techniques for identifying affective intent in robot-directed speech. For example, by analyzing the prosody of a person's speech, robots such as Kismet and Leonardo can determine whether they were scolded, praised, or given an attentional bid.
Most importantly, the robot can distinguish these affective intents from neutral, indifferent speech. Nevertheless, much more work is needed to explore the potential of prosodic information in speech interaction within a computational framework. Models of this kind could be included in robots and discourse agents, such as personal assistants.
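As an illustration of the kind of acoustic measurement such a computational framework would start from, the following minimal sketch estimates fundamental frequency from a single short frame by picking the autocorrelation peak. The function name, the pitch search range, and the synthetic test tone are assumptions made for illustration only; they are not part of the project, and a real system would add voicing detection and operate on recorded speech.

```python
import numpy as np


def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=500.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak.

    f0_min and f0_max are assumed search bounds, not values from the project.
    """
    # Remove any DC offset so it does not dominate the autocorrelation.
    frame = frame - np.mean(frame)
    # Full autocorrelation; keep only the non-negative lags.
    acf = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Restrict the search to lags that correspond to plausible pitch periods.
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag = lag_min + int(np.argmax(acf[lag_min:lag_max + 1]))
    return sample_rate / best_lag


# Synthetic 40 ms tone at 200 Hz, standing in for a voiced speech frame.
sample_rate = 16000
t = np.arange(int(0.04 * sample_rate)) / sample_rate
tone = np.sin(2 * np.pi * 200.0 * t)
f0 = estimate_f0(tone, sample_rate)
print(f"Estimated F0: {f0:.1f} Hz")
```

Per-frame estimates like this one, summarized over an utterance (for example as mean and range), are one possible input to an emotion classifier alongside duration, formant, and voice quality measures.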
Supervisor: Charalambos (Haris) Themistocleous