Mitigating Racial Bias in Skin Lesion Classification with a Novel Deep Learning-Driven Dataset
Abstract
(1) Background: A critical issue in applying Machine Learning (ML) to dermatology is racial bias in training datasets, which can lead to disparities in diagnostic performance across skin tones. This paper proposes a mitigation strategy for the BCN20000 dataset that addresses its limited coverage of lesions on darker skin. (2) Method: We augmented the BCN20000 dataset with a custom-developed dataset of dark-skin lesion images, generated by combining serial style transfer with latent diffusion-based upscaling. We then assessed the impact of the augmented dataset on the accuracy of several established Convolutional Neural Network (CNN) architectures, including DenseNet, ConvNeXt, EfficientNet, RegNet, and ResNet. (3) Results: The best-performing model achieved 92% accuracy when trained on the augmented dataset containing additional dark-skin images, compared with baseline models trained solely on BCN20000. Our findings underscore the need for diverse datasets in AI-driven dermatology tools and align with prior studies emphasizing the critical role of representative training data in mitigating racial bias in medical AI systems.
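The abstract reports a single overall accuracy figure. One way to check whether performance differs across skin tones is to break accuracy down by group. The following is a minimal sketch of that idea and not the authors' evaluation code: the function name `accuracy_by_group`, the class labels, the group names, and the toy predictions are all hypothetical and do not come from the paper.

```python
from collections import defaultdict


def accuracy_by_group(y_true, y_pred, groups):
    """Return (overall accuracy, {group: accuracy}) for paired labels.

    Hypothetical helper for illustration; not part of the paper.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        total[group] += 1
        correct[group] += int(truth == pred)
    per_group = {g: correct[g] / total[g] for g in total}
    overall = sum(correct.values()) / sum(total.values())
    return overall, per_group


# Toy data (invented): six predictions, three per skin-tone group.
y_true = ["nevus", "melanoma", "nevus", "melanoma", "nevus", "melanoma"]
y_pred = ["nevus", "melanoma", "nevus", "nevus", "nevus", "melanoma"]
groups = ["light", "light", "light", "dark", "dark", "dark"]

overall, per_group = accuracy_by_group(y_true, y_pred, groups)

# Gap between the best- and worst-served group; smaller is more equitable.
gap = max(per_group.values()) - min(per_group.values())
print(f"overall={overall:.3f}, per_group={per_group}, gap={gap:.3f}")
```

Comparing this gap for a model trained on the baseline data with the gap for a model trained on the augmented data would show whether the added images narrow the disparity, rather than only raising the aggregate score.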