
LLiMe: enhancing text classifier explanations with large language models

Angiulli, Fabrizio; De Luca, Francesco; Fassetti, Fabio; Nisticò, Simona
2025-01-01

Abstract

The widespread diffusion of black-box text classifiers necessitates explainable AI (XAI) techniques for this domain. A seminal XAI technique is Local Interpretable Model-agnostic Explanations (LIME). For text classification, LIME maps an input sentence and its neighbours into a bag of words, using a linear regressor as an interpretable model. However, this strategy has significant limitations. Neighbouring sentences are constructed solely by extracting subsets of the input sentence, which may fail to accurately capture the local decision boundary. Moreover, these subsets are not guaranteed to be representative of the classification classes, potentially leading to unbalanced or misleading interpretability. Additionally, such generated sentences might lack semantic coherence. Furthermore, the resulting explanation is often limited to confirming the relevance of a term or highlighting the impact of its removal, without providing deeper insights. This work addresses these limitations by proposing LLiMe, an extension of LIME that exploits advances in Large Language Models (LLMs) to perform a classifier-driven generation of the neighbourhood. Our approach allows neighbours to employ a vocabulary larger than that of the input text. A generation procedure is introduced to more effectively capture the local decision boundary by ensuring generated samples span all classes involved in the classification. Additionally, an LLM-driven explanation and a counterfactual generation procedure are presented, returning the most relevant set of editing operations to influence the black-box predictor's decision. Thus, the approach provides a richer, easier-to-interpret explanation and higher-quality counterfactuals compared to standard LIME. Experiments on real datasets demonstrate the technique's effectiveness in providing suitable, relevant, and interpretable explanations.
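The baseline the abstract criticises, LIME for text, can be illustrated with a minimal sketch: neighbours are word subsets of the input, a black-box is queried on them, and a weighted linear surrogate ranks each word's contribution. The classifier `toy_black_box` and all function names here are hypothetical stand-ins, not the paper's implementation.

```python
import numpy as np
from sklearn.linear_model import Ridge

def toy_black_box(sentences):
    # Hypothetical classifier: P(positive) is 1 when "great" appears, else 0.
    return np.array([1.0 if "great" in s.split() else 0.0 for s in sentences])

def lime_text_explanation(sentence, black_box, n_samples=200, seed=0):
    """Sketch of LIME's word-subset neighbourhood + linear surrogate."""
    rng = np.random.default_rng(seed)
    words = sentence.split()
    # Binary masks: 1 keeps a word, 0 drops it (subsets of the input only,
    # which is exactly the limitation LLiMe targets).
    masks = rng.integers(0, 2, size=(n_samples, len(words)))
    masks[0, :] = 1  # include the original sentence itself
    neighbours = [" ".join(w for w, m in zip(words, row) if m) for row in masks]
    preds = black_box(neighbours)
    # Weight each neighbour by its similarity to the input
    # (here: fraction of words retained).
    weights = masks.mean(axis=1)
    surrogate = Ridge(alpha=1.0)
    surrogate.fit(masks, preds, sample_weight=weights)
    # Rank words by the magnitude of their surrogate coefficient.
    return sorted(zip(words, surrogate.coef_), key=lambda t: -abs(t[1]))

ranking = lime_text_explanation("the movie was great fun", toy_black_box)
```

Because every neighbour is a subset of the input, the surrogate can only confirm or deny the relevance of words already present; it cannot suggest substitutions, which motivates the LLM-generated neighbourhoods described above.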
Keywords: Black-box explanation; Explainable AI; Large language models; Local interpretable explanation
Files for this item:
There are no files associated with this item.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/394777

Citations: Scopus: 0