Automatic Construction Safety Report Using Visual Question Answering and Segmentation Model
  • Tran, Dai Quoc
  • Jeon, Yuntae
  • Son, Seongwoo
  • Kulinan, Almo Senja
  • Lee, Changjun
  • Park, Seunghee
Citations
  • Web of Science: 0
  • Scopus: 0

Abstract

Construction safety reports are crucial for reducing the accident rate in this complex and dangerous working environment. Deep learning-based computer vision techniques such as object detection, image segmentation, and semantic segmentation are widely used to recognize hazards. However, little research has addressed hazard identification from a combined natural language processing and computer vision perspective. Visual question answering (VQA) together with a segmentation model can fill this gap. VQA is a vision-and-language model that can infer violated safety rules from an input image, while a segmentation model can automatically segment the objects relevant to the violated rules. Building on these two components, an end-to-end automatic safety report system is developed. In this research, we propose a novel approach that combines VQA and a segmentation model in a construction safety report system. First, a large “scenario-questions” dataset with 200,000 images and 16 questions is created from a public segmentation dataset. Then, a vision-and-language transformer model is trained and validated. Next, the segmentation model is used in postprocessing to explain the reason for each hazard inference and to increase the accuracy of the baseline model. Both the VQA and segmentation models achieved robust validation accuracy, and the combined approach can be considered suitable for safety management and monitoring on construction sites. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords

Automatic Safety Report; Segmentation; Visual Question Answering
Title
Automatic Construction Safety Report Using Visual Question Answering and Segmentation Model
Authors
Tran, Dai Quoc; Jeon, Yuntae; Son, Seongwoo; Kulinan, Almo Senja; Lee, Changjun; Park, Seunghee
DOI
10.1007/978-3-031-84224-5_25
Publication date
2025-08
Type
Proceedings Paper
Journal
Lecture Notes in Civil Engineering
630 LNCE
Pages
307-317