Clinically validated depression dataset aligned with DSM-5 criteria for major depressive disorder (MDD)

Abstract

In 2021, over half of American adults reported experiencing depressive symptoms following COVID-19 infection. Left untreated, depression can significantly elevate the risk of harmful behaviors, including substance abuse, which in turn may damage relationships, hinder workplace performance, and impair recovery from serious illnesses. To support early diagnosis and intervention, this study introduces EmoDep (Emotion-related to Depression), a clinically aligned, annotated dataset for depression detection that captures nine core symptoms of major depressive disorder (MDD) and their corresponding intensity levels. We curated this dataset by collecting user-generated posts and comments from the Reddit community r/depression between January 2010 and December 2019, and systematically refined it based on the Diagnostic and Statistical Manual of Mental Disorders, Fifth Edition (DSM-5) criteria. With assistance from mental health professionals, we constructed a symptom-specific lexical dictionary reflecting the nine MDD criteria. To annotate intensity, we employed the Best-Worst Scaling (BWS) technique and further applied data augmentation strategies to ensure comprehensive coverage. Overall, the EmoDep dataset provides a reliable textual resource for multi-label classification of depressive symptoms and their severity, facilitating improved detection of depression-related expressions on online platforms. © 2025 The Authors
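
The abstract names Best-Worst Scaling (BWS) for intensity annotation but does not describe the scoring step. As a rough illustration only, the sketch below uses the commonly used counting procedure, in which an item's score is the share of times it was picked as "best" minus the share of times it was picked as "worst". The function name `bws_scores`, the tuple layout, and the toy post IDs are assumptions made for this example and are not taken from the EmoDep paper.

```python
from collections import Counter


def bws_scores(annotations):
    """Score items from Best-Worst Scaling judgments by simple counting.

    annotations: iterable of (items, best, worst), where `items` is the
    tuple shown to an annotator and `best` / `worst` are the items chosen
    as most and least intense. This layout is a hypothetical one.
    Returns {item: (times best - times worst) / times shown}, in [-1, 1].
    """
    best, worst, seen = Counter(), Counter(), Counter()
    for items, b, w in annotations:
        if b not in items or w not in items or b == w:
            raise ValueError("best and worst must be distinct members of items")
        seen.update(items)  # every item in the tuple was shown once
        best[b] += 1
        worst[w] += 1
    return {item: (best[item] - worst[item]) / seen[item] for item in seen}


# Hypothetical toy judgments over five post IDs (not real EmoDep data).
toy = [
    (("p1", "p2", "p3", "p4"), "p1", "p4"),
    (("p1", "p2", "p3", "p5"), "p1", "p3"),
    (("p2", "p3", "p4", "p5"), "p2", "p4"),
]
scores = bws_scores(toy)
```

Scores near 1 mark items consistently judged most intense and scores near -1 mark items consistently judged least intense. How the authors map such scores onto their intensity levels is not stated in the abstract.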

Keywords

Depression dataset; Major depressive disorder; Natural language processing; Sentiment analysis
Authors
Oh, Jaedong; Lim, Jooyoung; Oh, Hayoung
DOI
10.1016/j.eswa.2025.128691
Publication date
2026-01
Type
Article
Journal
Expert Systems with Applications
Volume 296