Zero-/Few-Shot Anomaly Classification for Transistor Using Multimodal CLIP Retrieval Augmented
Abstract
Detecting anomalies in transistors is challenging because the features that distinguish normal from abnormal components are intricate. This paper introduces a zero-shot and few-shot anomaly classification framework for transistors based on a retrieval-augmented multimodal approach. We focus on two critical parts of the transistor: the Component BOX and the Metal Legs. Using Florence-2 for prompt-based bounding box generation and Segment Anything Model 2 (SAM 2) for segmentation, we create precise masks for each part. Embeddings generated with Contrastive Language-Image Pre-training (CLIP) are then used to classify each component. For the few-shot scenario, we implement Retrieval-Augmented Generation (RAG) to simulate learning from both images and textual data, which improves anomaly classification performance. Our zero-shot model achieved an F1 score of 70.5, while the few-shot model attained an improved F1 score of 77.0, demonstrating the efficacy of our approach in transistor anomaly detection.
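As a rough illustration of the embedding-based classification described in the abstract, the following minimal Python sketch compares a part embedding against text-prompt embeddings (zero-shot) and against a small labeled support set (few-shot). It is not the authors' implementation. The three-dimensional vectors are made-up stand-ins for CLIP embeddings, the function names and labels are hypothetical, and the few-shot step is simplified to nearest-neighbor retrieval with a majority vote rather than a full RAG pipeline.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def zero_shot_classify(image_emb, prompt_embs):
    """Return the label whose text-prompt embedding is most similar to the image embedding."""
    return max(prompt_embs, key=lambda label: cosine(image_emb, prompt_embs[label]))


def few_shot_classify(image_emb, support, k=3):
    """Retrieve the k most similar labeled examples and return the majority label.

    `support` is a list of (embedding, label) pairs. This is an assumed
    simplification of the retrieval step, not the paper's RAG procedure.
    """
    ranked = sorted(support, key=lambda item: cosine(image_emb, item[0]), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy stand-ins for CLIP embeddings (hypothetical values, for illustration only).
prompt_embs = {
    "normal": [1.0, 0.0, 0.0],      # e.g. a prompt describing intact metal legs
    "anomalous": [0.0, 1.0, 0.0],   # e.g. a prompt describing bent metal legs
}
support = [
    ([0.9, 0.1, 0.0], "normal"),
    ([0.8, 0.2, 0.1], "normal"),
    ([0.1, 0.9, 0.0], "anomalous"),
    ([0.2, 0.8, 0.1], "anomalous"),
    ([0.5, 0.5, 0.6], "anomalous"),
]
query = [0.2, 0.7, 0.1]  # embedding of a masked transistor part

zero_shot_label = zero_shot_classify(query, prompt_embs)
few_shot_label = few_shot_classify(query, support, k=3)
print(zero_shot_label, few_shot_label)
```

In a real pipeline, the query vector would come from a CLIP image encoder applied to the masked Component BOX or Metal Legs region, and each part would be classified separately.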