The execution logs of a business process have recently been exploited to extract classification models for discriminating “deviant” instances of the process, i.e., instances diverging from normal/desired outcomes (e.g., frauds, faults, SLA violations). Regarding all log traces as sequences of task labels, current solutions essentially map each trace onto a vector space where the features correspond to sequence-oriented patterns, so that any standard classifier-induction method can be applied to separate the two classes of instances. An ensemble-learning approach was also recently proposed to combine multiple base learners trained on heterogeneous pattern-based log views. However, as these approaches simply abstract each event into an activity symbol, they disregard all the non-structural event data that are typically stored in real-life logs and that may well help improve the detection of deviances. Moreover, the usefulness of deviance models could be enhanced by equipping each prediction with a confidence measure, allowing the analyst to focus on (or prioritize) the more suspicious cases. To overcome these limitations, we propose a multi-view ensemble learning approach, which: (i) fully exploits the multi-dimensional nature of log events, with the help of a clustering-based trace abstraction method; and (ii) implements a context- and probability-aware stacking method for combining the base models' predictions. Tests on a real-life log confirmed the validity of the approach and its capability to achieve compelling performance with respect to state-of-the-art methods.
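As a rough illustration of the pattern-based encoding that the abstract attributes to current solutions, the sketch below maps each trace (a sequence of task labels) onto a vector of n-gram counts. It is a minimal sketch under stated assumptions, not the method of the paper: the choice of n-grams as the "sequence-oriented patterns", the function names, and the toy traces are all hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Tuple

Trace = Sequence[str]        # a trace is a sequence of task labels
Pattern = Tuple[str, ...]    # a pattern is an n-gram of task labels


def ngrams(trace: Trace, n: int) -> List[Pattern]:
    """Return all contiguous n-grams of task labels in a trace."""
    return [tuple(trace[i:i + n]) for i in range(len(trace) - n + 1)]


def build_vocabulary(traces: Sequence[Trace], n: int) -> List[Pattern]:
    """Collect every n-gram seen in the log, in a fixed (sorted) order."""
    seen = set()
    for trace in traces:
        seen.update(ngrams(trace, n))
    return sorted(seen)


def encode(trace: Trace, vocabulary: Sequence[Pattern], n: int) -> List[int]:
    """Map a trace onto a count vector with one feature per pattern."""
    counts = Counter(ngrams(trace, n))
    return [counts[pattern] for pattern in vocabulary]


# Hypothetical toy log: two traces over made-up task labels.
toy_log = [
    ["register", "check", "approve"],
    ["register", "check", "check", "reject"],
]

# Each value of n yields a different pattern-based "view" of the same log.
bigram_vocabulary = build_vocabulary(toy_log, n=2)
bigram_view = [encode(trace, bigram_vocabulary, n=2) for trace in toy_log]
```

Each row of `bigram_view` could then be passed, together with a deviant/normal label, to any standard classifier. Encoding the same log with a different `n` gives another view on which a separate base learner could be trained.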
Title: | A multi-view multi-dimensional ensemble learning approach to mining business process deviances |
Authors: | |
Publication date: | 2016 |
Handle: | http://hdl.handle.net/20.500.11770/312728 |
ISBN: | 9781509006199 |
Appears in categories: | 4.1 Contribution in conference proceedings |