Abstract
It has long been speculated that expressions of emotion across different modalities share the same underlying 'code', whether it be a dance step, a musical phrase, or a tone of voice. This is the first attempt to implement this theory across three modalities, inspired by the polyvalence and repeatability of robotics. We propose a unifying framework that generates emotions across voice, gesture, and music by representing an emotional state as a 4-parameter tuple of speed, intensity, regularity, and extent (SIRE). Our results show that this simple 4-tuple can capture four emotions recognizable at greater-than-chance rates across gesture and voice, and at least two emotions across all three modalities. An application to multi-modal, expressive music robots is discussed.
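To make the SIRE representation concrete, here is a minimal sketch in Python. It is not from the paper: the class name, the modality mapping, and all numeric values are illustrative assumptions about how a 4-tuple emotional state might be defined and translated into one modality's control parameters.

```python
from dataclasses import dataclass


@dataclass
class SIRE:
    """Hypothetical encoding of an emotional state as the paper's
    4-parameter tuple; each value is assumed normalized to [0, 1]."""
    speed: float       # rate of motion, speech, or musical tempo
    intensity: float   # force, loudness, or dynamic level
    regularity: float  # how periodic/even the output is (1.0 = fully regular)
    extent: float      # spatial amplitude of a gesture, pitch range, etc.


# Illustrative presets (invented values, not taken from the paper):
HAPPINESS = SIRE(speed=0.8, intensity=0.7, regularity=0.4, extent=0.8)
SADNESS = SIRE(speed=0.2, intensity=0.2, regularity=0.8, extent=0.2)


def to_voice_params(s: SIRE) -> dict:
    """Sketch of mapping one SIRE tuple onto modality-specific controls,
    here hypothetical speech-synthesis parameters."""
    return {
        "speech_rate": 0.5 + s.speed,       # faster state -> faster speech
        "volume": s.intensity,              # intensity -> loudness
        "pitch_jitter": 1.0 - s.regularity, # irregularity -> pitch jitter
        "pitch_range": s.extent,            # larger extent -> wider range
    }


print(to_voice_params(HAPPINESS))
print(to_voice_params(SADNESS))
```

The point of the framework is that the same `SIRE` instance could equally be passed to analogous `to_gesture_params` or `to_music_params` mappings, so the emotion is specified once and rendered in any modality.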
Original language | English |
---|---|
Article number | 3 |
Journal | EURASIP Journal on Audio, Speech, and Music Processing |
Volume | 2012 |
Issue number | 1 |
DOIs | |
Publication status | Published - 2012 |
Externally published | Yes |
Keywords
- Affective computing
- Entertainment robots
- Gesture
ASJC Scopus subject areas
- Acoustics and Ultrasonics
- Electrical and Electronic Engineering