Detailed Information


MSDLF-K: A Multimodal Feature Learning Approach for Sentiment Analysis in Korean Incorporating Text and Speech

Authors
Kim, Tae-Young; Yang, Jufeng; Park, Eunil
Issue Date
2024
Publisher
Institute of Electrical and Electronics Engineers Inc.
Keywords
multimodal deep learning; multimodal representation; natural language processing; sentiment analysis; speech recognition
Citation
IEEE Transactions on Multimedia
Indexed
SCIE
SCOPUS
Journal Title
IEEE Transactions on Multimedia
URI
https://scholarx.skku.edu/handle/2021.sw.skku/119644
DOI
10.1109/TMM.2024.3521707
ISSN
1520-9210 (print)
1941-0077 (online)
Abstract
Sentiment analysis research has made significant progress in capturing sentiment and subjectivity in text. Multimodal deep learning has broadened this scope further, enabling diverse modalities such as voice and image features to be integrated alongside text. Despite these advances, sentiment analysis of Korean, which has mostly been examined at the sentence level, remains challenging because of the language's agglutinative nature and linguistic ambiguity. To address this challenge, we propose a novel Multimodal Sentimental Deep Learning Framework for Korean (MSDLF-K), which examines not only Korean text but also its associated speech. MSDLF-K integrates spectrograms and waveforms from Korean voice data with embedding vectors derived from script sentences to create a unified multimodal representation. This design makes it possible to identify both shared and unique features in the latent space, offering insight into their respective effects on sentiment analysis performance. To validate MSDLF-K, we conducted a set of experiments on the emotion speech synthesis dataset. MSDLF-K achieves an accuracy of 79.0% for valence and 81.7% for arousal in emotion classification, metrics previously unexplored in the literature. Empirical analysis further shows that multimodal representations encompassing both text and voice significantly improve emotion analysis performance. In summary, our study presents a pioneering solution for sentiment analysis in Korean and underscores the value of multimodal approaches for more comprehensive and accurate sentiment analysis across diverse linguistic contexts.
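
The abstract describes fusing spectrograms and waveforms from speech with sentence-embedding vectors into a unified multimodal representation that feeds separate valence and arousal predictions. As a rough illustration of that fusion idea only, here is a minimal PyTorch sketch; all module names, feature dimensions, and the concatenation-based late fusion are assumptions made for illustration, not the published MSDLF-K architecture.

import torch
import torch.nn as nn

class MSDLFKSketch(nn.Module):
    # Hypothetical sketch of the fusion idea in the abstract: project
    # text embeddings, spectrogram features, and waveform features into
    # a common latent space, concatenate them into a unified multimodal
    # representation, and predict valence and arousal separately.
    # Dimensions and the fusion strategy are illustrative assumptions.
    def __init__(self, text_dim=768, spec_dim=128, wave_dim=64,
                 latent_dim=256, n_classes=2):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, latent_dim)
        self.spec_proj = nn.Linear(spec_dim, latent_dim)
        self.wave_proj = nn.Linear(wave_dim, latent_dim)
        self.valence_head = nn.Linear(latent_dim * 3, n_classes)
        self.arousal_head = nn.Linear(latent_dim * 3, n_classes)

    def forward(self, text_emb, spec_feat, wave_feat):
        # Per-modality projections followed by late fusion via concatenation.
        z = torch.cat([
            torch.relu(self.text_proj(text_emb)),
            torch.relu(self.spec_proj(spec_feat)),
            torch.relu(self.wave_proj(wave_feat)),
        ], dim=-1)
        return self.valence_head(z), self.arousal_head(z)

# Example usage with random inputs (batch of 4):
# model = MSDLFKSketch()
# v_logits, a_logits = model(torch.randn(4, 768),
#                            torch.randn(4, 128),
#                            torch.randn(4, 64))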
Files in This Item
There are no files associated with this item.
Appears in Collections
Computing and Informatics > Convergence > 1. Journal Articles

Related Researcher

Park, Eunil
Computing and Informatics (Convergence)
