
Automatic speech discrete labels to dimensional emotional values conversion method

  • Shaoling Jing
  • Xia Mao
  • Lijiang Chen*
  • *Corresponding author for this work
  • Beihang University

Research output: Contribution to journal, Article, peer-review

Abstract

Dimensional emotion estimation (e.g. arousal and valence) from spontaneous and realistic expressions has drawn increasing commercial attention. However, the application of dimensional emotion estimation technology remains a challenge due to issues such as manual annotation and evaluation. In this work, the authors introduce an automatic annotation and emotion prediction model. The automatic annotation is performed through three main steps: (i) label initialisation, (ii) automatic label annotation, and (iii) label optimisation. The approach has been validated on databases in different languages with different types of emotion expressions, including spontaneous, acted and induced emotional expressions. Compared with the unoptimised predicted labels, optimisation improves the concordance correlation coefficient (CCC) values by an average of 0.104 for arousal and 0.051 for valence. Furthermore, the standard variation between annotated values and the ground truth is reduced to an average of 0.44 for arousal and 0.34 for valence. Finally, the CCC values using the proposed model reach 0.58 for arousal and 0.28 for valence, which further verifies the feasibility and reliability of the proposed model. The proposed method can be used to reduce labour-intensive and time-consuming manual annotation work.
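
The abstract reports its results as CCC values but does not define the metric. As a minimal sketch, assuming the standard definition of Lin's concordance correlation coefficient with population (biased) variance and covariance, it could be computed as below. The function name `concordance_cc` and the sample values are hypothetical and are not taken from the paper.

```python
from statistics import fmean


def concordance_cc(predicted, reference):
    """Lin's concordance correlation coefficient between two sequences.

    Assumes population (divide-by-n) variance and covariance; this is a
    common convention, not one confirmed by the paper.
    """
    if len(predicted) != len(reference) or len(predicted) == 0:
        raise ValueError("sequences must be non-empty and of equal length")

    mean_p = fmean(predicted)
    mean_r = fmean(reference)

    var_p = fmean([(p - mean_p) ** 2 for p in predicted])
    var_r = fmean([(r - mean_r) ** 2 for r in reference])
    cov = fmean([(p - mean_p) * (r - mean_r)
                 for p, r in zip(predicted, reference)])

    denom = var_p + var_r + (mean_p - mean_r) ** 2
    if denom == 0:
        # Both sequences are the same constant; treating this as perfect
        # agreement is a convention chosen for this sketch.
        return 1.0
    return 2 * cov / denom


# Hypothetical arousal values on an arbitrary scale.
predicted_arousal = [0.2, 0.5, 0.7, 0.4]
reference_arousal = [0.1, 0.6, 0.8, 0.3]
print(concordance_cc(predicted_arousal, reference_arousal))
```

Unlike the Pearson correlation, this measure also penalises a shift in mean or scale between predictions and reference labels, which is why a perfectly correlated but offset prediction scores below 1.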

Original language: English
Pages (from-to): 168-176
Number of pages: 9
Journal: IET Biometrics
Volume: 8
Issue number: 2
State: Published - 1 Mar 2019
