Reconstruction error-based anomaly detection with few outlying examples
Angiulli F.;Fassetti F.;Ferragina L.
2026-01-01
Abstract
Reconstruction error-based neural architectures constitute a classical deep learning approach to anomaly detection that has shown strong performance. The approach consists in training an Autoencoder to reconstruct a set of examples deemed to represent normality, and then flagging as anomalies those data points that exhibit a sufficiently large reconstruction error. Unfortunately, these architectures often learn to reconstruct the anomalies in the data as well. This phenomenon is more pronounced when anomalies are present in the training set. In particular, when these anomalies are labeled, a setting called semi-supervised, the standard way to train Autoencoders is to ignore the anomalies and minimize the reconstruction error on normal data only. When a sufficiently large and representative set of anomalous examples is available, the problem essentially shifts toward a classification task, where standard supervised strategies can be applied effectively. In this work, instead, we focus on the more challenging scenario in which only a limited number of anomalous examples is available, and these examples are not sufficiently representative of the wide variability that anomalies may exhibit. We propose AE-SAD, a novel reconstruction error-based architecture that explicitly leverages labeled anomalies to guide the model. Our method introduces a new loss formulation that forces anomalies to be reconstructed according to a transformation function, effectively pushing them outside the description of normal data. This strategy increases the separation between the reconstruction errors of normal and anomalous samples, thereby improving the detection of both seen and unseen anomalies. Extensive experiments demonstrate that AE-SAD consistently outperforms both standard Autoencoders and the most competitive deep learning techniques for semi-supervised anomaly detection, achieving state-of-the-art results.
In particular, our method proves superior across a diverse set of benchmarks, including vectorial data, high-dimensional datasets, and image domains. Moreover, AE-SAD maintains its advantage even in challenging scenarios where the training data are polluted by anomalies that are incorrectly labeled as normal, further highlighting its robustness and practical applicability.
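The abstract does not give the loss in closed form, but the idea of reconstructing labeled anomalies "according to a transformation function" can be illustrated with a minimal sketch. The function name `ae_sad_loss` and the specific choice of transformation `F(x) = 1 - x` (which assumes inputs normalized to [0, 1]) are illustrative assumptions, not necessarily the paper's exact formulation:

```python
import numpy as np

def ae_sad_loss(x, x_hat, y, F=lambda v: 1.0 - v):
    """Per-sample reconstruction loss with labeled anomalies.

    x     : (n, d) inputs
    x_hat : (n, d) Autoencoder reconstructions
    y     : (n,)   labels, 0 = normal, 1 = labeled anomaly
    F     : transformation target for anomalies (hypothetical
            choice F(v) = 1 - v, valid for data in [0, 1])

    Normal samples (y=0) are pulled toward themselves, as in a
    standard Autoencoder; labeled anomalies (y=1) are pulled
    toward F(x), so that reconstructing an anomaly faithfully
    yields a LARGE error at test time.
    """
    target = np.where(y[:, None] == 1, F(x), x)
    return np.mean((x_hat - target) ** 2, axis=1)

# A perfectly reconstructed sample has low loss only if it is normal:
x = np.array([[0.2, 0.8]])
x_hat = x.copy()  # perfect reconstruction
print(ae_sad_loss(x, x_hat, np.array([0])))  # near zero
print(ae_sad_loss(x, x_hat, np.array([1])))  # large: target is F(x)
```

Under this sketch, the training signal widens the gap the abstract describes: a well-trained model maps normal inputs close to themselves and anomalous inputs close to `F(x)`, so the test-time reconstruction error separates the two populations.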


