- Choi, Goeun;
- Kwon, Soonwook;
- Song, Jinwoo;
- Akbar, Ali;
- Hong, Jung-taek
Abstract
Named Entity Recognition (NER) is crucial for building knowledge bases and facilitating semantic search in the construction industry. While conventional NER models can identify general entities such as spatial and organizational information, extracting domain-specific entities such as materials and dimensions from construction-related texts, particularly in Bill of Quantities (BoQ) and Building Information Modeling (BIM) parameters, remains challenging and requires extensive manual annotation. Despite extensive research in construction NLP, existing approaches fail to address the integration challenges between heterogeneous BIM-BoQ data formats and lack domain-specific entity recognition capabilities. Key entity categories were defined, and datasets from four BoQ and two BIM sources were annotated to establish ground truth labels. A semi-automated labelling process was introduced to streamline annotation and improve training efficiency. Experimental results demonstrate that the proposed framework reduces annotation time nearly threefold compared with manual processes. This study developed a BERT-based NER model achieving F1 scores ranging from 0.81 to 0.97, with higher performance for well-defined construction parameters (name, material, size, thickness, diameter, length, type: 0.95-0.97) than for miscellaneous text entities (0.81). The extracted entities are aligned with standardized formats using semantic text similarity techniques. This ontology-based integration enhances data consistency, interoperability, and retrieval accuracy, improving semantic alignment while minimizing discrepancies from heterogeneous terminology.
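The alignment step mentioned in the abstract (mapping extracted entities onto standardized terms by text similarity) can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not say which similarity technique is used, so the example swaps in a simple character-level ratio from Python's standard `difflib` module as a lexical stand-in for semantic similarity. The vocabulary `STANDARD_TERMS`, the function names `similarity` and `align`, and the 0.6 threshold are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical standardized vocabulary; these terms are NOT taken from the paper.
STANDARD_TERMS = ["concrete", "reinforcing steel", "gypsum board", "ceramic tile"]


def similarity(a: str, b: str) -> float:
    """Case-insensitive character-level similarity in [0, 1].

    A lexical stand-in for semantic similarity (assumption); an
    embedding-based measure could be substituted here.
    """
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def align(entity: str, vocabulary=STANDARD_TERMS, threshold: float = 0.6):
    """Return the closest standard term, or None if no term reaches the threshold."""
    best = max(vocabulary, key=lambda term: similarity(entity, term))
    return best if similarity(entity, best) >= threshold else None
```

Under these assumptions, an abbreviated entity such as "gypsum brd" would map to "gypsum board", while a string with no close counterpart in the vocabulary would be left unaligned (`None`) rather than forced onto a poor match.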
- Title: ONTOLOGY DEVELOPMENT USING LANGUAGE MODEL-BASED NAMED ENTITY RECOGNITION FOR INTEGRATED CONSTRUCTION INFORMATION
- Authors: Choi, Goeun; Kwon, Soonwook; Song, Jinwoo; Akbar, Ali; Hong, Jung-taek
- Publication date: 2026
- Type: Article
- Volume: 32
- Issue: 4
- Pages: 548-562