Bias in Dermatological Datasets: A Critical Analysis of the Underrepresentation of Dark Skin Tones in Melanoma Classification Images
Ruga, Tommaso; Zumpano, Ester; Vocaturo, Eugenio; Caroprese, Luciano
2025-01-01
Abstract
Cutaneous melanoma presents a profound healthcare challenge, particularly for individuals with darker skin tones, where late diagnosis significantly increases mortality rates. Despite remarkable advancements in artificial intelligence for medical diagnostics, current dermatological image classification systems suffer from a critical ethical and methodological limitation: severe underrepresentation of diverse skin tones in training datasets. This research uses MultiExCam, our novel multi-approach explainable architecture, to quantitatively demonstrate the systemic bias in melanoma detection across different skin tones. Our contributions are threefold: first, we comprehensively analyze major dermatological image repositories, documenting the severe underrepresentation of Fitzpatrick skin types V-VI across all datasets examined; second, we introduce Pipsqueak, a meticulously curated dataset of melanocytic lesions in darker skin tones, which demonstrates the profound scarcity of diverse representation in existing resources; and third, through empirical validation, we quantify performance disparities that emerge when models trained predominantly on light skin images are applied to darker skin tones, revealing accuracy drops that could translate to potentially fatal clinical consequences. This work provides crucial evidence for the urgent need to develop more inclusive diagnostic technologies that can effectively serve all populations, regardless of skin tone, and challenges the field to prioritize deliberate collection of diverse dermatological data.
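The abstract's third contribution rests on measuring how classifier accuracy shifts across skin-tone groups. As a minimal, hedged sketch (not the authors' MultiExCam pipeline, and using made-up arrays rather than any real dataset), the snippet below illustrates one straightforward way to stratify accuracy by Fitzpatrick type and report the gap between lighter (I-IV) and darker (V-VI) groups; the function names and toy data are illustrative assumptions.

```python
# Illustrative sketch only: quantifying a light-vs-dark accuracy gap for a
# binary melanoma classifier, stratified by Fitzpatrick skin type.
import numpy as np

def stratified_accuracy(y_true, y_pred, fitzpatrick):
    """Return accuracy per Fitzpatrick type as a dict {type: accuracy}."""
    y_true, y_pred, fitzpatrick = map(np.asarray, (y_true, y_pred, fitzpatrick))
    return {
        int(ftype): float((y_pred[fitzpatrick == ftype] == y_true[fitzpatrick == ftype]).mean())
        for ftype in np.unique(fitzpatrick)
    }

def light_dark_gap(y_true, y_pred, fitzpatrick):
    """Accuracy difference between Fitzpatrick I-IV and V-VI subsets."""
    y_true, y_pred, fitzpatrick = map(np.asarray, (y_true, y_pred, fitzpatrick))
    light = fitzpatrick <= 4
    dark = fitzpatrick >= 5
    acc = lambda mask: float((y_pred[mask] == y_true[mask]).mean())
    return acc(light) - acc(dark)

# Toy example with hypothetical labels: 1 = melanoma, 0 = benign.
y_true = [1, 0, 1, 1, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1]
fitz   = [1, 2, 3, 5, 5, 6, 6, 2]  # Fitzpatrick type assigned to each image
print(stratified_accuracy(y_true, y_pred, fitz))
print("light-vs-dark accuracy gap:", light_dark_gap(y_true, y_pred, fitz))
```

In practice, such a disparity metric would be computed on held-out images with clinician-assigned Fitzpatrick labels, and could be extended to sensitivity or AUC per group rather than raw accuracy.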


