Reconstruction error-based anomaly detection with few outlying examples
Angiulli F.;Fassetti F.;Ferragina L.
2026-01-01
Abstract
Reconstruction error-based neural architectures constitute a classical deep learning approach to anomaly detection that has shown strong performance. The approach consists in training an Autoencoder to reconstruct a set of examples deemed to represent normality, and then flagging as anomalies those data points that exhibit a sufficiently large reconstruction error. Unfortunately, these architectures often learn to reconstruct the anomalies in the data as well. This phenomenon is more pronounced when anomalies are present in the training set. In particular, when these anomalies are labeled, a setting called semi-supervised, the standard way to train Autoencoders is to ignore the anomalies and minimize the reconstruction error on normal data only. When a sufficiently large and representative set of anomalous examples is available, the problem essentially shifts toward a classification task, where standard supervised strategies can be applied effectively. In this work, instead, we focus on the more challenging scenario in which only a limited number of anomalous examples is available, and these examples are not sufficiently representative of the wide variability that anomalies may exhibit. We propose AE-SAD, a novel reconstruction error-based architecture that explicitly leverages labeled anomalies to guide the model. Our method introduces a new loss formulation that forces anomalies to be reconstructed according to a transformation function, effectively pushing them outside the description of normal data. This strategy increases the separation between the reconstruction errors of normal and anomalous samples, thereby improving the detection of both seen and unseen anomalies. Extensive experiments demonstrate that AE-SAD consistently outperforms both standard Autoencoders and the most competitive deep learning techniques for semi-supervised anomaly detection, achieving state-of-the-art results.
In particular, our method proves superior across a diverse set of benchmarks, including vectorial data, high-dimensional datasets, and image domains. Moreover, AE-SAD maintains its advantage even in challenging scenarios where the training data are polluted by anomalies that are incorrectly labeled as normal, further highlighting its robustness and practical applicability.
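The abstract does not give the loss in closed form, but the idea of reconstructing labeled anomalies "according to a transformation function" can be illustrated with a minimal sketch. The function name `ae_sad_loss` and the specific choice of transformation `F(x) = 1 - x` (which assumes inputs normalized to [0, 1]) are illustrative assumptions, not necessarily the paper's exact formulation:

```python
import numpy as np

def ae_sad_loss(x, x_hat, y, F=lambda v: 1.0 - v):
    """Per-sample reconstruction loss with labeled anomalies.

    x     : (n, d) inputs
    x_hat : (n, d) Autoencoder reconstructions
    y     : (n,)   labels, 0 = normal, 1 = labeled anomaly
    F     : transformation target for anomalies (hypothetical
            choice F(v) = 1 - v, valid for data in [0, 1])

    Normal samples (y=0) are pulled toward themselves, as in a
    standard Autoencoder; labeled anomalies (y=1) are pulled
    toward F(x), so that reconstructing an anomaly faithfully
    yields a LARGE error at test time.
    """
    target = np.where(y[:, None] == 1, F(x), x)
    return np.mean((x_hat - target) ** 2, axis=1)

# A perfectly reconstructed sample has low loss only if it is normal:
x = np.array([[0.2, 0.8]])
x_hat = x.copy()  # perfect reconstruction
print(ae_sad_loss(x, x_hat, np.array([0])))  # near zero
print(ae_sad_loss(x, x_hat, np.array([1])))  # large: target is F(x)
```

Under this sketch, the training signal widens the gap the abstract describes: a well-trained model maps normal inputs close to themselves and anomalous inputs close to `F(x)`, so the test-time reconstruction error separates the two populations.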


