Explainable Hierarchical Swin Transformer for Multi-Scale Breast Cancer Histopathology Classification

Movahedkor, Narges; Shahbazian, Reza; Trubitsyna, Irina
2026-01-01

Abstract

Accurate and transparent classification of breast cancer histopathology remains a major challenge because of morphological variability, class imbalance, and computational constraints in whole-slide image analysis. Convolutional neural networks (CNNs) capture local tissue features but tend to miss global context, whereas Vision Transformers (ViTs) are data-hungry and sensitive to staining variations. We provide a systematic, controlled comparison and propose a hierarchical Swin Transformer framework that combines local and global representations through adaptive channel recalibration and attention-based feature aggregation on region-of-interest (RoI) images. Class-balanced upsampling further improves robustness to uneven class distributions. Evaluations on the BRACS dataset show gains of 7-10% in accuracy and F1 score over strong CNN and ViT baselines. To maintain clinical transparency, we assessed multiple explainability techniques and found that the model highlights diagnostically meaningful tissue regions. The proposed framework balances predictive performance and interpretability for computer-aided breast cancer diagnosis.
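The class-balanced upsampling mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not describe the actual procedure, so this assumes plain random oversampling with replacement until every class matches the largest one. The function name `class_balanced_upsample`, the toy RoI identifiers, and the label names are hypothetical and are not taken from the paper or from BRACS.

```python
import random
from collections import Counter, defaultdict


def class_balanced_upsample(samples, labels, seed=0):
    """Randomly duplicate minority-class samples until all classes are equally sized.

    Assumption: simple random oversampling with replacement; the paper's
    exact balancing scheme is not given in the abstract.
    """
    rng = random.Random(seed)

    # Group samples by their class label.
    by_class = defaultdict(list)
    for sample, label in zip(samples, labels):
        by_class[label].append(sample)

    # Every class is upsampled to the size of the largest class.
    target = max(len(items) for items in by_class.values())

    out_samples, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples at random from the class itself.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for sample in items + extra:
            out_samples.append(sample)
            out_labels.append(label)
    return out_samples, out_labels


# Hypothetical toy data: RoI identifiers with an imbalanced label distribution.
rois = ["roi_0", "roi_1", "roi_2", "roi_3", "roi_4", "roi_5"]
labels = ["benign", "benign", "benign", "benign", "atypical", "malignant"]

balanced_rois, balanced_labels = class_balanced_upsample(rois, labels)
print(Counter(balanced_labels))
```

In practice such balancing would be applied to the training split only, so that validation and test metrics still reflect the original class distribution.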
Year: 2026
ISBN: 9781643686615
Keywords: Breast Cancer Classification; Multi-Scale Attention; Swin Transformer; Transfer Learning
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/407897