Enhancing multimodal emotion recognition with dynamic fuzzy membership and attention fusion
  • Nguyen, Nhut Minh
  • Nguyen, Trung Minh
  • Nguyen, Thanh Trung
  • Tran, Phuong-Nam
  • Pham, Nhat Truong
  • and 4 others
Citations
  • Web of Science: 1
  • Scopus: 1

Abstract

Multimodal learning has been shown to improve classification in speech emotion recognition (SER). Despite this advantage, multimodal SER approaches often face key challenges, including limited robustness to uncertainty, difficulty generalizing across diverse emotional contexts, and inefficient integration of heterogeneous modalities. To overcome these constraints, we propose FleSER, a multimodal emotion recognition architecture that leverages dynamic fuzzy membership and attention-based fusion. Unlike most previous SER studies, which apply fuzzy logic at the decision level, FleSER introduces a feature-level, rule-based dynamic fuzzy membership mechanism that adaptively refines modality representations prior to fusion. The architecture uses audio and textual modalities, employing self-modality and cross-modality attention mechanisms with α interpolation to capture complementary emotional cues. The α interpolation-based feature fusion mechanism adaptively emphasizes the more informative modality across varying contexts, supporting robust multimodal integration. Together, these components improve recognition accuracy. We evaluate FleSER on multiple benchmark datasets, where it surpasses previous state-of-the-art (SOTA) approaches in emotion recognition. Ablation studies further validate the contribution of each key component to the performance of the FleSER architecture, covering unimodal versus multimodal input, fuzzy membership functions, fusion strategies, and the projection dimension.
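
The abstract does not give the formulas behind the fuzzy membership step or the α interpolation, so the following is only a minimal sketch of the general idea under stated assumptions, not the FleSER implementation. It assumes a Gaussian membership function whose center and width are taken from the feature vector itself, and a fixed scalar α that linearly blends an audio feature vector with a text feature vector. The function names (`gaussian_membership`, `fuzzy_refine`, `alpha_interpolate`) are hypothetical, and the attention layers are omitted.

```python
import math
from typing import List


def gaussian_membership(x: float, center: float, width: float) -> float:
    """Gaussian fuzzy membership in (0, 1]; equals 1.0 when x == center."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))


def fuzzy_refine(features: List[float]) -> List[float]:
    """Scale each feature by its membership degree.

    Assumption: the fuzzy set is centered on the vector's mean with a width
    equal to its standard deviation, so the membership adapts to each input.
    """
    n = len(features)
    mean = sum(features) / n
    variance = sum((f - mean) ** 2 for f in features) / n
    width = math.sqrt(variance) or 1.0  # fall back to 1.0 for constant vectors
    return [f * gaussian_membership(f, mean, width) for f in features]


def alpha_interpolate(audio: List[float], text: List[float], alpha: float) -> List[float]:
    """Blend two equal-length feature vectors: alpha * audio + (1 - alpha) * text.

    Assumption: alpha is a fixed scalar in [0, 1]; an adaptive scheme would
    compute it from the inputs instead.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    if len(audio) != len(text):
        raise ValueError("feature vectors must have the same length")
    return [alpha * a + (1.0 - alpha) * t for a, t in zip(audio, text)]


# Refine each modality before fusion, then blend with equal weight.
audio_features = fuzzy_refine([0.2, 0.9, 0.4])
text_features = fuzzy_refine([0.7, 0.1, 0.5])
fused = alpha_interpolate(audio_features, text_features, alpha=0.5)
```

With α = 1 the fused vector equals the audio features and with α = 0 it equals the text features, so α directly controls which modality dominates the fused representation.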

Keywords

Attention-based feature fusion; Dynamic fuzzy membership; Feature fusion; Multimodal emotion recognition; Speech emotion recognition
Authors
Nguyen, Nhut Minh; Nguyen, Trung Minh; Nguyen, Thanh Trung; Tran, Phuong-Nam; Pham, Nhat Truong; Le, Linh; Othmani, Alice; El Saddik, Abdulmotaleb; Dang, Duc Ngoc Minh
DOI
10.1016/j.engappai.2025.113396
Publication date
2026-02-01
Type
Article
Journal
Engineering Applications of Artificial Intelligence
165