Data mining at the IoT edge

Savaglio C.;Fortino G.
2019-01-01

Abstract

The Internet of Things (IoT) enables the interconnection of new cyber-physical devices, which generate significant traffic of distributed, heterogeneous and dynamic data at the network edge. Since several IoT applications demand short response times (e.g., industrial applications, emergency management, real-time systems) and, at the same time, rely on resource-constrained devices, the adoption of traditional Data Mining techniques is neither effective nor efficient. Conventional Data Mining techniques therefore need to be adjusted to optimize response times, energy consumption and data traffic while still providing the accuracy required by the IoT application. In this paper, new Data Mining approaches tailored to the IoT scenario are investigated, in particular with respect to the emerging distributed computing paradigm of Edge Computing. In detail, two approximated versions of the K-Means clustering algorithm, one centralized and one distributed, have been implemented in the EdgeCloudSim simulation framework and validated on a real system. As highlighted by the algorithm performance analysis, choosing an approximated and distributed clustering solution can provide benefits in terms of computation, communication and energy consumption, while maintaining high levels of accuracy. This trade-off has to be managed in light of the specific IoT application requirements.
2019
978-1-7281-1856-7
Clustering; Distributed Data Mining; Edge Computing; Internet of Things
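
The abstract does not describe how the paper's two K-Means variants are approximated or distributed, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that each edge node assigns its own points to the current centroids and sends back per-cluster sums and counts, and that a coordinator merges these summaries. The approximation shown (assigning only a random fraction of local points, controlled by the hypothetical `sample_frac` parameter) and all function names are assumptions made for illustration.

```python
import random

def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda j: sum((p - c) ** 2 for p, c in zip(point, centroids[j])))

def local_summary(points, centroids, sample_frac=1.0, rng=None):
    """Runs on one edge node: per-cluster coordinate sums and point counts.

    sample_frac < 1.0 is the assumed approximation: only a random subset
    of the local points is assigned, trading accuracy for computation.
    """
    rng = rng or random.Random(0)
    if sample_frac < 1.0 and points:
        n = max(1, int(len(points) * sample_frac))
        points = rng.sample(points, n)
    dim = len(centroids[0])
    sums = [[0.0] * dim for _ in centroids]
    counts = [0] * len(centroids)
    for p in points:
        j = nearest(p, centroids)
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    return sums, counts

def merge(summaries, centroids):
    """Runs on the coordinator: combine node summaries into new centroids."""
    dim = len(centroids[0])
    new_centroids = []
    for j, old in enumerate(centroids):
        total = sum(counts[j] for _, counts in summaries)
        if total == 0:
            new_centroids.append(list(old))  # empty cluster: keep old centroid
            continue
        new_centroids.append(
            [sum(sums[j][d] for sums, _ in summaries) / total for d in range(dim)])
    return new_centroids

def distributed_kmeans(node_data, centroids, iterations=10, sample_frac=1.0):
    """node_data holds one list of points per edge node."""
    rng = random.Random(0)
    for _ in range(iterations):
        summaries = [local_summary(pts, centroids, sample_frac, rng)
                     for pts in node_data]
        centroids = merge(summaries, centroids)
    return centroids

# Two hypothetical edge nodes, each holding two 2-D points.
nodes = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
start = [[0.0, 0.0], [10.0, 10.0]]
result = distributed_kmeans(nodes, start, iterations=5)
```

In this sketch only k sums and k counts per node cross the network in each iteration, rather than the raw points. With `sample_frac=1.0` the merged update equals the standard K-Means update computed over all points, so any loss of accuracy comes from the sampling alone.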

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/299257
Citations
  • PMC: not available
  • Scopus: 35
  • ISI: 2