Explainable Hierarchical Swin Transformer for Multi-Scale Breast Cancer Histopathology Classification

Movahedkor, Narges;Shahbazian, Reza;Trubitsyna, Irina

2026-01-01

Abstract

Accurate and transparent classification of breast cancer histopathology remains a major challenge because of morphological variability, class imbalance, and the computational constraints of whole-slide image analysis. Convolutional neural networks (CNNs) capture local tissue features but tend to miss global context, whereas Vision Transformers (ViTs) are data-hungry and sensitive to staining variations. We provide a systematic, controlled comparison and propose a hierarchical Swin Transformer framework that leverages both local and global representations through adaptive channel recalibration and attention-based feature aggregation on region-of-interest (RoI) images. Class-balanced upsampling further improves robustness to uneven class distributions. Evaluations on the BRACS dataset show gains of 7-10% in accuracy and F1 score over strong CNN and ViT baselines. To maintain clinical transparency, we assessed multiple explainability techniques and found that the model highlights diagnostically meaningful tissue regions. The proposed framework balances predictive performance and interpretability for computer-aided breast cancer diagnosis.
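
The class-balanced upsampling mentioned in the abstract can be illustrated with a minimal sketch. The record does not describe the authors' implementation, so everything here is an assumption: the helper name `balanced_upsample`, the choice to resample minority classes with replacement up to the size of the largest class, and the toy labels are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def balanced_upsample(samples, labels, seed=0):
    """Resample minority classes with replacement until every class
    has as many items as the largest class.

    Hypothetical helper for illustration only; not the authors' code.
    """
    if len(samples) != len(labels):
        raise ValueError("samples and labels must have the same length")

    # Group samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)
    if not by_class:
        return [], []

    rng = random.Random(seed)  # fixed seed keeps the sketch reproducible
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Keep every original item, then draw the shortfall with replacement.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


if __name__ == "__main__":
    # Toy data: class "A" dominates, "B" and "C" are rare.
    xs = ["a1", "a2", "a3", "a4", "b1", "c1"]
    ys = ["A", "A", "A", "A", "B", "C"]
    new_xs, new_ys = balanced_upsample(xs, ys)
    print(Counter(new_ys))
```

In practice such resampling would normally be applied to the training split only, so that validation and test metrics still reflect the original class distribution.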

Year: 2026
ISBN: 9781643686615
Keywords: Breast Cancer Classification; Multi-Scale Attention; Swin Transformer; Transfer Learning
Publication type: 4.1 Contribution in conference proceedings

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/407897
