YOLO-augment strategy with diffusion-based inpainting for enhanced traffic sign detection

Publisher

PeerJ Computer Science

Abstract

Traffic sign datasets often suffer from data scarcity and class imbalance, both of which hinder the development of robust autonomous driving systems. This article proposes a dataset augmentation method that uses Stable Diffusion inpainting to generate realistic synthetic traffic signs. The method fine-tunes a Stable Diffusion model and introduces an object-size-based crop (OSB-crop) technique with mask adjustments so that augmentations remain high in quality and consistent with their surrounding context. Evaluations using the Fréchet Inception Distance (FID, where lower is better) show average scores of 195.85 for the DFG-T10 subset and 247.077 for the DFG-B10 subset, indicating that the method can produce realistic inpainted signs, particularly for the better-represented minority classes. Qualitative analyses further show seamless integration into real-world scenes, although challenges remain for extremely underrepresented classes and for achieving perfect visual fidelity. The approach can enrich traffic sign datasets, mitigate class imbalance, and supply more diverse and realistic training data for more reliable autonomous driving systems. This study focuses on evaluating the quality of the generated data itself, as a foundational step toward enhancing downstream detection models. Its limitations include the computational cost of fine-tuning and the difficulty of achieving high-quality inpainting for all underrepresented classes, especially those with poor initial data quality. This work lays a foundation for advancing dataset augmentation techniques for real-world applications.
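
The abstract names an object-size-based crop (OSB-crop) with mask adjustments but does not spell out its rules. The sketch below is only one plausible reading, not the authors' implementation: it cuts a square window whose side scales with the sign's bounding box, shifts the window to stay inside the image, and expresses a slightly padded mask in crop coordinates. The names `Box`, `size_based_crop`, and `mask_in_crop`, and the parameters `scale`, `min_side`, and `pad`, are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates (left, top, right, bottom)."""
    x0: int
    y0: int
    x1: int
    y1: int

    @property
    def width(self) -> int:
        return self.x1 - self.x0

    @property
    def height(self) -> int:
        return self.y1 - self.y0


def size_based_crop(sign: Box, img_w: int, img_h: int,
                    scale: float = 4.0, min_side: int = 64) -> Box:
    """Square crop around the sign; side grows with the sign's larger dimension.

    `scale` and `min_side` are illustrative assumptions, not values from the paper.
    """
    side = max(min_side, round(scale * max(sign.width, sign.height)))
    side = min(side, img_w, img_h)  # the crop cannot exceed the image
    cx = (sign.x0 + sign.x1) // 2
    cy = (sign.y0 + sign.y1) // 2
    # Shift the window back inside the image rather than shrinking it.
    x0 = min(max(cx - side // 2, 0), img_w - side)
    y0 = min(max(cy - side // 2, 0), img_h - side)
    return Box(x0, y0, x0 + side, y0 + side)


def mask_in_crop(sign: Box, crop: Box, pad: int = 2) -> Box:
    """Sign box in crop coordinates, grown by `pad` pixels and clipped to the crop."""
    return Box(
        max(sign.x0 - crop.x0 - pad, 0),
        max(sign.y0 - crop.y0 - pad, 0),
        min(sign.x1 - crop.x0 + pad, crop.width),
        min(sign.y1 - crop.y0 + pad, crop.height),
    )


if __name__ == "__main__":
    sign = Box(100, 50, 140, 90)               # a 40x40 px sign near the top edge
    crop = size_based_crop(sign, 1920, 1080)   # window handed to the inpainting model
    print(crop, mask_in_crop(sign, crop))
```

Tying the window to object size keeps small signs from being lost in a large, mostly irrelevant context, while the padded mask gives the inpainting model a few pixels of margin to blend the sign into the scene. Whether the published method makes the same choices is not stated in the abstract.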