TY - GEN
T1 - Shared-Specific Feature Enhancement and Dual Distilled Contextual Graph Refinement Network for Multimodal Conversational Emotion Recognition
AU - Li, Fangkun
AU - Ma, Yulan
AU - Li, Yang
N1 - Publisher Copyright: © 2024 IEEE.
PY - 2024
Y1 - 2024
N2 - Emotion recognition in conversation (ERC) is a crucial task in building empathetic machines. Previous studies usually treat the modalities (i.e., visual, audio, text) equally and rarely focus on the shared and specific features among them, leading to redundancy in the multimodal features. Moreover, the extraction of contextual information in a dialogue and multimodal knowledge transfer remain challenging, which results in inadequate capture of relational and contextual semantics. To address these issues, we propose a Shared-Specific Feature Enhancement and Dual Distilled Contextual Graph Refinement Network (S2FE-D2CGRN) for the ERC task. Specifically, a shared-specific feature enhancement module (S2FEM) is first designed to enhance the text-guided shared semantics and make the specific features discriminative. Second, to refine and distill the contextual knowledge, a dual-distilled contextual graph refinement module (D2CGRM) is introduced, which includes contextual graph construction and two-stage knowledge distillation. Extensive experiments on two public multimodal datasets show the effectiveness of the proposed method compared with state-of-the-art methods, indicating its potential for application in conversational emotion recognition.
AB - Emotion recognition in conversation (ERC) is a crucial task in building empathetic machines. Previous studies usually treat the modalities (i.e., visual, audio, text) equally and rarely focus on the shared and specific features among them, leading to redundancy in the multimodal features. Moreover, the extraction of contextual information in a dialogue and multimodal knowledge transfer remain challenging, which results in inadequate capture of relational and contextual semantics. To address these issues, we propose a Shared-Specific Feature Enhancement and Dual Distilled Contextual Graph Refinement Network (S2FE-D2CGRN) for the ERC task. Specifically, a shared-specific feature enhancement module (S2FEM) is first designed to enhance the text-guided shared semantics and make the specific features discriminative. Second, to refine and distill the contextual knowledge, a dual-distilled contextual graph refinement module (D2CGRM) is introduced, which includes contextual graph construction and two-stage knowledge distillation. Extensive experiments on two public multimodal datasets show the effectiveness of the proposed method compared with state-of-the-art methods, indicating its potential for application in conversational emotion recognition.
KW - contextual graph
KW - emotion recognition
KW - knowledge distillation
KW - shared-specific features
UR - https://www.scopus.com/pages/publications/105003135994
U2 - 10.1109/IARCE64300.2024.00059
DO - 10.1109/IARCE64300.2024.00059
M3 - Conference contribution
AN - SCOPUS:105003135994
T3 - Proceedings - 2024 4th International Conference on Industrial Automation, Robotics and Control Engineering, IARCE 2024
SP - 278
EP - 282
BT - Proceedings - 2024 4th International Conference on Industrial Automation, Robotics and Control Engineering, IARCE 2024
PB - Institute of Electrical and Electronics Engineers Inc.
T2 - 4th International Conference on Industrial Automation, Robotics and Control Engineering, IARCE 2024
Y2 - 15 November 2024 through 17 November 2024
ER -