
Underdiagnosis Bias Mitigation With Expert Foundation Model's Representation

Bahre G. H.; Quarta A.; Calimeri F.
2025-01-01

Abstract

Vector embeddings (VEmb) are compact data representations learned by large foundation models that have recently gained popularity; using them in place of the original data requires less storage and computational power, making them ideal for low-resource computational settings. Nevertheless, compressing large amounts of data, such as a large medical image, into a low-dimensional VEmb might weaken critical disease information, potentially increasing the risk of underdiagnosis, i.e., unhealthy patients flagged as healthy. In this work, we evaluate the underdiagnosis bias of chest X-ray disease classifiers based on both VEmb and image-based models. Underdiagnosis occurs when an AI model incorrectly classifies unhealthy patients as healthy, leading to missed diagnoses and potential delays in treatment. We use VEmb from Google's X-ray foundation model (trained exclusively on X-rays) and from BiomedCLIP (trained on various types of medical imaging data). Our findings show that the model based on the expert Google VEmb, fine-tuned solely on chest X-rays, reduces underdiagnosis rates and narrows the underdiagnosis gap across subpopulations. Conversely, the VEmb from BiomedCLIP, a non-expert model trained on a diverse range of medical images, yields lower disease classification performance and higher underdiagnosis rates. This underscores the importance of obtaining VEmb from expert models trained on domain-specific data. Furthermore, we observed weaker demographic signal detection in the VEmb of expert foundation models compared to image-based models. This aligns with the decrease in underdiagnosis bias, emphasizing the potential benefits of minimizing demographic signals to achieve fairer outcomes.
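As a reading aid, below is a minimal sketch of how the underdiagnosis rate described in the abstract, and the underdiagnosis gap across demographic subpopulations, could be computed. The function names, the binary healthy/unhealthy encoding, and the toy data are illustrative assumptions and are not taken from the paper.

```python
import numpy as np

def underdiagnosis_rate(y_true, y_pred):
    """Fraction of truly unhealthy patients (y_true == 1) that the
    classifier labels as healthy (y_pred == 0)."""
    unhealthy = y_true == 1
    if unhealthy.sum() == 0:
        return float("nan")
    return float(np.mean(y_pred[unhealthy] == 0))

def underdiagnosis_gap(y_true, y_pred, groups):
    """Per-subgroup underdiagnosis rates and the largest pairwise gap.

    `groups` holds demographic labels (e.g. sex or age bucket),
    aligned with `y_true` and `y_pred`. Names and encoding are
    illustrative assumptions, not the paper's implementation.
    """
    rates = {
        g: underdiagnosis_rate(y_true[groups == g], y_pred[groups == g])
        for g in np.unique(groups)
    }
    gap = max(rates.values()) - min(rates.values())
    return rates, gap

# Toy example: 1 = unhealthy, 0 = healthy (hypothetical data).
y_true = np.array([1, 1, 1, 1, 0, 0, 1, 1])
y_pred = np.array([0, 1, 1, 0, 0, 0, 1, 1])
groups = np.array(["A", "A", "A", "A", "B", "B", "B", "B"])

rates, gap = underdiagnosis_gap(y_true, y_pred, groups)
print(rates, gap)  # e.g. {'A': 0.5, 'B': 0.0} and a gap of 0.5
```

In this hedged setup, a lower gap between subgroups corresponds to the narrower underdiagnosis disparity that the abstract reports for the expert VEmb-based model.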
2025
AI fairness
AI in medical imaging
chest X-rays
foundation models
vector embedding representation


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/390077