CLIP-MDC: CLIP encoder based multimodal defect classification with synthetic anomaly generation for real-time surface defect detection

Abstract

In this study, using various text prompts that combine objects and defect types, we establish a semantic space linking images and texts, enabling explainable defect predictions in natural language. We introduce contrastive language-image pre-training-based multimodal defect classification (CLIP-MDC), a framework designed for multimodal defect detection and classification in smart manufacturing. The model integrates a lightweight backbone network with contrastive language-image pre-training (CLIP) encoders to perform both pixel-level anomaly segmentation and image-level defect classification in supervised and weakly supervised settings. Additionally, we incorporate a Perlin noise-based synthetic anomaly generation technique to facilitate learning in environments with limited labeled data, and the dual prediction architecture enables simultaneous inference of defect location and type. In experiments on the MVTec AD and KSDD2 datasets, the model achieved an area under the receiver operating characteristic curve (AUROC) of 99.9%, an area under the per-region overlap curve (AUPRO) of 98.6%, a pixel-level AUROC (P-AUROC) of 99.9%, and an average precision for localization of 87.6%. It also demonstrated real-time capability, with an average inference time of 6.6 ms on an A100 GPU. CLIP-MDC uses a semantic-based multimodal learning framework that combines visual and linguistic information to deliver accuracy, explainability, generalization, and real-time efficiency in defect detection, making it a practical and scalable solution for industrial defect analysis in real-world manufacturing environments.
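
The abstract names Perlin noise-based synthetic anomaly generation without giving details. The following is a minimal sketch of one common way such a step is built (thresholding a Perlin noise field into a mask and blending a foreign texture into the masked region), not the authors' implementation. The function names `perlin_noise` and `synthesize_anomaly`, and the parameters `cells`, `threshold`, and `beta`, are hypothetical choices made for illustration.

```python
import numpy as np


def perlin_noise(height, width, cells, rng):
    """2D Perlin (gradient) noise with `cells` lattice cells per axis."""
    # Random unit gradient vectors at the lattice corners.
    angles = rng.uniform(0.0, 2.0 * np.pi, size=(cells + 1, cells + 1))
    grads = np.stack([np.cos(angles), np.sin(angles)], axis=-1)

    # Pixel coordinates in lattice units, kept strictly below `cells`.
    ys = np.linspace(0.0, cells, height, endpoint=False)
    xs = np.linspace(0.0, cells, width, endpoint=False)
    y, x = np.meshgrid(ys, xs, indexing="ij")
    y0, x0 = np.floor(y).astype(int), np.floor(x).astype(int)
    fy, fx = y - y0, x - x0

    def corner(dy, dx):
        # Dot product of the corner gradient with the offset to the pixel.
        g = grads[y0 + dy, x0 + dx]
        return g[..., 0] * (fx - dx) + g[..., 1] * (fy - dy)

    def fade(t):
        # Quintic smoothstep used by improved Perlin noise.
        return t * t * t * (t * (t * 6 - 15) + 10)

    u, v = fade(fx), fade(fy)
    top = corner(0, 0) * (1 - u) + corner(0, 1) * u
    bottom = corner(1, 0) * (1 - u) + corner(1, 1) * u
    return top * (1 - v) + bottom * v


def synthesize_anomaly(image, texture, rng, cells=4, threshold=0.2, beta=0.5):
    """Blend `texture` into `image` inside a thresholded Perlin mask.

    Assumed (hypothetical) recipe: returns the augmented image and the
    binary mask, which can serve as a pixel-level training label.
    """
    h, w = image.shape[:2]
    noise = perlin_noise(h, w, cells, rng)
    mask = (noise > threshold).astype(image.dtype)
    m = mask[..., None] if image.ndim == 3 else mask
    blended = beta * texture + (1.0 - beta) * image
    augmented = (1.0 - m) * image + m * blended
    return augmented, mask
```

In this sketch, pixels outside the mask are left unchanged, so the returned mask marks exactly the region that was altered; `threshold` controls how much of the image is covered and `beta` how strongly the foreign texture shows through.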

Keywords

Anomaly detection; Synthetic anomaly generation; CLIP; Multimodal defect classification; Weakly supervised learning
Title
CLIP-MDC: CLIP encoder based multimodal defect classification with synthetic anomaly generation for real-time surface defect detection
Authors
Ha, Taewon; Hwang, Chaeseon; Jeong, Jongpil
DOI
10.1007/s10845-025-02773-4
Publication date
2026-01-17
Type
Article; Early Access
Journal
Journal of Intelligent Manufacturing