Emotion recognition from speech: tools and challenges

Al-Talabani, Abdulbasit and Sellahewa, Harin and Jassim, Sabah A. (2015) Emotion recognition from speech: tools and challenges. In: SPIE Mobile Multimedia/Image Processing, Security, and Applications, 20 April 2015, Baltimore, Maryland, United States.

AATalabani_etal_SPIE_2015.pdf - Accepted Version

Official URL: http://doi.org/10.1117/12.2191623


Human emotion recognition from speech is studied frequently because of its importance in many applications, e.g. human-computer interaction. There is wide diversity and disagreement about the basic emotions or emotion-related states on the one hand, and about where the emotion-related information lies in the speech signal on the other. These disagreements motivate our investigation into extracting Meta-features using the PCA approach or a non-adaptive random projection (RP), both of which significantly reduce the dimension of the large speech feature vectors that may contain a wide range of emotion-related information. Subsets of Meta-features are fused to increase the performance of the recognition model, which adopts the score-based LDC classifier. We shall demonstrate that our scheme outperforms state-of-the-art results when tested on non-prompted databases or acted databases (i.e. when subjects act specific emotions while uttering a sentence). However, the huge gap between the accuracy rates achieved on the different types of speech datasets raises questions about the way emotions modulate speech. In particular, we shall argue that emotion recognition from speech should not be dealt with as a classification problem. We shall demonstrate the presence of a spectrum of different emotions in the same speech portion, especially in the non-prompted datasets, which tend to be more “natural” than the acted datasets, where the subjects attempt to suppress all but one emotion. © (2015) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
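
The pipeline outlined in the abstract (dimension reduction by PCA or random projection, then score-level fusion with an LDC) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes that LDC stands for a linear discriminant classifier with equal class priors, that fusion is a plain sum of per-subset scores, and that synthetic Gaussian data can stand in for real speech features. All function names and parameter values below are hypothetical.

```python
import numpy as np

def pca_fit(X, k):
    """Return (mean, components) for a k-dimensional PCA projection."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T

def random_projection_matrix(d, k, rng):
    """Non-adaptive Gaussian random projection: drawn without looking at the data."""
    return rng.standard_normal((d, k)) / np.sqrt(k)

def ldc_fit(Z, y, n_classes, reg=1e-3):
    """Linear discriminant classifier with a pooled, regularised covariance.

    Assumes equal class priors, so no log-prior term is added.
    """
    means = np.stack([Z[y == c].mean(axis=0) for c in range(n_classes)])
    centered = Z - means[y]
    cov = centered.T @ centered / len(Z) + reg * np.eye(Z.shape[1])
    W = np.linalg.solve(cov, means.T)        # column c is cov^-1 @ mean_c
    b = -0.5 * np.sum(means.T * W, axis=0)   # -0.5 * mean_c^T cov^-1 mean_c
    return W, b

def ldc_scores(Z, model):
    """Per-class linear discriminant scores, shape (n_samples, n_classes)."""
    W, b = model
    return Z @ W + b

# Synthetic stand-in for high-dimensional speech feature vectors (assumption).
rng = np.random.default_rng(0)
n_classes, d, k, n_per_class = 3, 200, 10, 60
class_means = 1.5 * rng.standard_normal((n_classes, d))
y = np.repeat(np.arange(n_classes), n_per_class)
X = class_means[y] + rng.standard_normal((len(y), d))

# Hold out one third of the samples for testing.
order = rng.permutation(len(y))
train_idx, test_idx = order[:120], order[120:]

# Meta-feature subset 1: PCA fitted on the training data only.
mean, P = pca_fit(X[train_idx], k)
pca_model = ldc_fit((X[train_idx] - mean) @ P, y[train_idx], n_classes)

# Meta-feature subset 2: data-independent random projection.
R = random_projection_matrix(d, k, rng)
rp_model = ldc_fit(X[train_idx] @ R, y[train_idx], n_classes)

# Score-level fusion: sum the per-class scores of the two classifiers.
fused = (ldc_scores((X[test_idx] - mean) @ P, pca_model)
         + ldc_scores(X[test_idx] @ R, rp_model))
pred = fused.argmax(axis=1)
accuracy = float((pred == y[test_idx]).mean())
print(f"fused accuracy on held-out synthetic data: {accuracy:.2f}")
```

The fused score matrix keeps one value per emotion class for each utterance. A system that reports the whole score vector rather than only its argmax would be one way to expose the spectrum of emotions that the abstract argues is present in non-prompted speech.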

Item Type: Conference or Workshop Item (Paper)
Additional Information: Citation format: Abdulbasit Al-Talabani; Harin Sellahewa and Sabah A. Jassim, "Emotion recognition from speech: tools and challenges", Proc. SPIE 9497, Mobile Multimedia/Image Processing, Security, and Applications 2015, 94970N (May 21, 2015); doi:10.1117/12.2191623; http://dx.doi.org/10.1117/12.2191623 Copyright notice format: Copyright 2015 Society of Photo Optical Instrumentation Engineers. One print or electronic copy may be made for personal use only. Systematic reproduction and distribution, duplication of any material in this paper for a fee or for commercial purposes, or modification of the content of the paper are prohibited. (http://spie.org/x1125.xml)
Uncontrolled Keywords: Classification, Emotion recognition, Dimension reduction
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
Divisions: School of Computing
Depositing User: Harin Sellahewa
Date Deposited: 03 Aug 2015 10:55
Last Modified: 07 Jun 2016 10:37
URI: http://bear.buckingham.ac.uk/id/eprint/38
