DF-graph: Structured and explainable analysis of communication data for digital forensics

Abstract

Communication data, such as instant messenger exchanges, SMS records, and emails, plays a critical role in digital forensic investigations by revealing criminal intent, interpersonal dynamics, and the temporal structure of events. However, existing AI-based forensic tools frequently hallucinate unverifiable content, obscure their reasoning paths, and ultimately fail to meet the traceability and legal admissibility standards required in criminal investigations. To overcome these challenges, we propose DF-GRAPH, a graph-based retrieval-augmented generation (Graph-RAG) framework designed for forensic question answering over communication data. DF-GRAPH constructs structured knowledge graphs from message logs, retrieves query-relevant subgraphs based on semantic and structural cues, and generates answers guided by forensic-specific prompts. It further enhances legal transparency through rule-based reasoning traces and citation of message-level evidence. We comprehensively evaluate DF-GRAPH across real-world, public, and synthetic datasets, including a narrative dataset adapted from Crime and Punishment. Our evaluation compares four approaches: (1) a direct generation approach using only a language model without retrieval; (2) a BERT embedding-based selective retrieval approach that identifies relevant messages before generation; (3) a conventional text-based retrieval approach; and (4) our proposed graph-based retrieval approach (DF-GRAPH). Empirical results show that DF-GRAPH consistently outperforms all baseline approaches in exact match accuracy (57.23 %), semantic similarity (BERTScore F1: 0.8597), and contextual faithfulness. A user study with eight forensic experts confirms that DF-GRAPH delivers more explainable, accurate, and legally defensible outputs, making it a practical solution for AI-assisted forensic investigations. © 2025 Elsevier B.V., All rights reserved.
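The pipeline described in the abstract (graph construction from message logs, subgraph retrieval, and message-level citation) can be illustrated with a minimal sketch. This is not the authors' implementation: the `Message` fields, the function names `build_graph`, `retrieve_subgraph`, and `cite`, and the scoring rule are all hypothetical, and plain word overlap stands in for the semantic matching that the paper does not detail here. The generation step with a language model is omitted.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Message:
    # Hypothetical schema; the paper's actual graph schema is not given in the abstract.
    msg_id: str
    sender: str
    receiver: str
    timestamp: str  # ISO 8601 string, so lexical order matches time order
    text: str


def build_graph(messages):
    """Map each participant (node) to the messages (edges) they sent or received."""
    graph = defaultdict(list)
    for m in messages:
        graph[m.sender].append(m)
        if m.receiver != m.sender:
            graph[m.receiver].append(m)
    return graph


def retrieve_subgraph(graph, query, top_k=2):
    """Return up to top_k messages linked to people or terms named in the query.

    Assumption: word overlap is a stand-in for semantic similarity, and a
    participant named in the query counts as the structural cue.
    """
    terms = set(query.lower().replace("?", "").split())
    scores = {}
    for person, msgs in graph.items():
        structural = 1 if person.lower() in terms else 0
        for m in msgs:
            lexical = len(terms & set(m.text.lower().split()))
            score = 2 * structural + lexical
            if score > 0:
                prev = scores.get(m.msg_id, (0, m))[0]
                scores[m.msg_id] = (max(prev, score), m)
    # Highest score first; ties broken by earlier timestamp.
    ranked = sorted(scores.values(), key=lambda s: (-s[0], s[1].timestamp))
    return [m for _, m in ranked[:top_k]]


def cite(evidence):
    """Format retrieved messages as message-level citations."""
    return [
        f"[{m.msg_id}] {m.sender} -> {m.receiver} at {m.timestamp}: {m.text}"
        for m in evidence
    ]


messages = [
    Message("m1", "alice", "bob", "2025-01-01T09:00", "meet at the warehouse tonight"),
    Message("m2", "bob", "alice", "2025-01-01T09:05", "ok bring the keys"),
    Message("m3", "carol", "dave", "2025-01-01T10:00", "lunch tomorrow"),
]
graph = build_graph(messages)
evidence = retrieve_subgraph(graph, "Where did alice plan to meet?")
citations = cite(evidence)
```

In a fuller system, the cited messages would be passed to a language model together with a forensic-specific prompt, so that every statement in the answer can be traced back to a message identifier.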

Keywords

AI communication; Digital forensics; Explainable NLP; Retrieval-augmented generation
Title
DF-graph: Structured and explainable analysis of communication data for digital forensics
Authors
Lee, Jeongin; Lim, Chaejin; Jin, Beomjin; Min, Moohong; Kim, Hyoungshick
DOI
10.1016/j.fsidi.2025.301981
Publication date
2025-10
Type
Article; Proceedings Paper
Journal
Forensic Science International: Digital Investigation
54