Coverage-guaranteed speech emotion recognition via calibrated uncertainty-adaptive prediction sets

  • Beihang University
  • Chongqing University of Posts and Telecommunications

Research output: Contribution to journal › Article › peer-review

Abstract

Road rage, often triggered by emotional suppression and sudden outbursts, significantly threatens road safety by causing collisions and aggressive behavior. Speech emotion recognition technologies can mitigate this risk by identifying negative emotions early and issuing timely alerts. However, current speech emotion recognition methods, such as those based on hidden Markov models and long short-term memory networks, primarily handle one-dimensional signals, frequently overfit, and lack calibration, which limits their effectiveness in safety-critical settings. We propose a novel risk-controlled prediction framework that provides statistically rigorous guarantees on prediction accuracy. This approach employs a calibration set to define a binary loss function indicating whether the true label is included in the prediction set. Using a data-driven threshold β, we optimize a joint loss function so that the expected test loss remains bounded by a user-specified risk level α. Evaluations across five baseline models and three benchmark datasets demonstrate that our framework consistently achieves a coverage of at least 1−α, effectively controlling marginal error rates across varying calibration-test split ratios (e.g., 0.1). The robustness and generalizability of the framework are further validated through an extension to small-batch online calibration under a local exchangeability assumption. We construct a non-negative test martingale to maintain prediction validity even in dynamic and non-exchangeable environments. Cross-corpus and cross-language tests confirm our method's ability to uphold reliable statistical guarantees in realistic, evolving data scenarios.
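The calibration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the prediction set is formed by keeping every emotion class whose softmax probability is at least β, and it picks β with the standard split-conformal / conformal-risk-control rule for a binary miscoverage loss. The function names (`calibrate_threshold`, `prediction_sets`, `simulate`) and the synthetic four-class data are hypothetical, and the joint loss function mentioned in the abstract is not reproduced here.

```python
import math
import numpy as np


def calibrate_threshold(cal_probs, cal_labels, alpha):
    """Largest beta whose risk-control bound stays at or below alpha.

    Assumed set rule: C(x) = {y : p(y | x) >= beta}. The binary loss is 1
    when the true label falls outside C(x). With n calibration points and
    m misses, the bound (m + 1) / (n + 1) <= alpha must hold.
    """
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    n = len(cal_labels)
    # Probability the model assigns to the true class of each calibration point.
    true_probs = cal_probs[np.arange(n), cal_labels]
    # Number of calibration misses the bound can tolerate.
    # (Floating-point rounding can only make this count smaller, i.e. safer.)
    k = math.floor(alpha * (n + 1)) - 1
    if k < 0:
        # Too few calibration points for this alpha: keep every class.
        return 0.0
    # At most k true-class probabilities lie strictly below the (k+1)-th smallest.
    return float(np.sort(true_probs)[k])


def prediction_sets(probs, beta):
    """Boolean mask of shape (n_samples, n_classes); True means 'in the set'."""
    return probs >= beta


# Synthetic stand-in for a classifier's softmax outputs (assumption, not real data).
rng = np.random.default_rng(0)
n_classes = 4


def simulate(n):
    labels = rng.integers(0, n_classes, size=n)
    logits = rng.normal(size=(n, n_classes))
    logits[np.arange(n), labels] += 2.0  # make the true class more likely
    exp_logits = np.exp(logits)
    return exp_logits / exp_logits.sum(axis=1, keepdims=True), labels


alpha = 0.1
cal_probs, cal_labels = simulate(500)
test_probs, test_labels = simulate(2000)

beta = calibrate_threshold(cal_probs, cal_labels, alpha)
test_sets = prediction_sets(test_probs, beta)
test_coverage = test_sets[np.arange(len(test_labels)), test_labels].mean()
mean_set_size = test_sets.sum(axis=1).mean()
print(f"beta={beta:.3f} coverage={test_coverage:.3f} mean set size={mean_set_size:.2f}")
```

Under exchangeability of calibration and test points, this rule keeps the expected miscoverage at or below α, so the observed test coverage should sit near or above 1−α on average; any single split can land slightly below. The online extension with a test martingale is outside the scope of this sketch.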

Original language: English
Article number: 111721
Journal: Engineering Applications of Artificial Intelligence
Volume: 159
State: Published - 15 Nov 2025

Keywords

  • Data-driven threshold
  • Non-negative test martingale
  • Risk control
  • Speech emotion recognition
