Details
- Jing, Shuju
- Li, Xiangyang
- Beyene, Daniel Asefa
- Cha, Gichun
- Park, Seunghee
Abstract
The high-precision geometric measurement capabilities of sensor-based point clouds provide significant advantages for sewer defect detection. To enhance the classification of valuable yet data-scarce sewer-defect knowledge within the point cloud community, this study proposes a cross-modal framework that combines self-supervised pre-training with supervised fine-tuning. The proposed Sewer Vision Transformer (Sewer-ViT) integrates key-edge sampling, neighborhood dilation learning, dual-domain feature fusion, and inverted bottleneck structures to reinforce defect feature embedding and inductive bias. These features are subsequently processed by a transformer encoder pre-trained with two-dimensional in-domain knowledge, and the latent representations are further optimized through weight fusion within a unified vector space, thereby improving classification performance. The method achieved average precision, recall, and F1-scores of 75.87%, 76.73%, and 75.44% on the overall test set, and 65.09%, 62.47%, and 62.58% on a real-world test set, respectively—surpassing existing approaches. These results highlight the practical potential of this method for sewer defect detection and point to a promising future for multi-modal fusion research.
- Title: Classification of Sewer Defects Using Point Clouds Based on a Novel Sewer Vision Transformer With Cross-Modal In-Domain Knowledge
- Authors: Jing, Shuju; Li, Xiangyang; Beyene, Daniel Asefa; Cha, Gichun; Park, Seunghee
- Issue date: 2025-11
- Type: Article
- Volume: 25
- Issue: 21
- Pages: 40188-40202