Detecting and Explaining Exceptional Values in Categorical Data

Angiulli F.; Fassetti F.; Palopoli L.; Serrao C.
2020-01-01

Abstract

In this work we deal with the problem of detecting and explaining exceptionally behaving values in categorical datasets by perceiving an attribute value as anomalous if its frequency of occurrence is exceptionally typical or untypical within the distribution of the frequency occurrences of any other attribute value. The notion of frequency occurrence is provided by specialising the Kernel Density Estimation method to the domain of frequency values, and an outlierness measure is defined by leveraging the cumulative distribution function (cdf) of such a density. This measure is able to simultaneously identify two kinds of anomalies, called lower outliers and upper outliers, namely values that are exceptionally infrequent or exceptionally frequent. Moreover, data values labeled as outliers come with an interpretable explanation for their abnormality, which is a desirable feature of any knowledge discovery technique.
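
The idea of scoring a value by where its frequency falls in the cdf of a kernel density estimate can be illustrated with a minimal sketch. This is not the authors' measure: the Gaussian kernel, the fixed `bandwidth`, the `tail` threshold, and the function names are all assumptions made here for illustration only.

```python
from collections import Counter
from math import erf, sqrt


def gaussian_cdf(z):
    """Standard normal cdf."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def frequency_cdf(freqs, x, bandwidth):
    """cdf at x of a Gaussian KDE fitted to the observed frequencies.

    The cdf of a Gaussian KDE is the average of the kernel cdfs
    centred on each observed frequency.
    """
    return sum(gaussian_cdf((x - f) / bandwidth) for f in freqs) / len(freqs)


def score_values(column, bandwidth=1.0, tail=0.1):
    """Label each distinct value of a categorical column.

    Assumption (not from the paper): a value is a lower outlier when the
    cdf at its frequency is at most `tail`, and an upper outlier when it
    is at least 1 - `tail`.
    """
    counts = Counter(column)
    freqs = list(counts.values())
    result = {}
    for value, f in counts.items():
        c = frequency_cdf(freqs, f, bandwidth)
        if c <= tail:
            label = "lower outlier"
        elif c >= 1.0 - tail:
            label = "upper outlier"
        else:
            label = "normal"
        result[value] = (f, c, label)
    return result


# Usage: five values with similar counts and one very rare value.
column = ["a"] * 50 + ["b"] * 48 + ["c"] * 52 + ["d"] * 49 + ["e"] * 51 + ["z"]
for value, (freq, cdf, label) in score_values(column).items():
    print(value, freq, round(cdf, 3), label)
```

In this sketch the returned frequency and cdf value serve as a rudimentary explanation of why a value was flagged; the choice of bandwidth strongly affects the labels and would need tuning on real data.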

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/313085