A Spark-based Task Allocation Solution for Machine Learning in the Edge-Cloud Continuum
Belcastro L.; Marozzo F.; Presta A.; Talia D.
2024-01-01
Abstract
The widespread use of Internet of Things (IoT) devices has led to an exponential increase in data at the Internet's edges. This trend, combined with the rapid growth of machine learning (ML) applications, requires learning tasks to be executed across the entire spectrum of computing resources, from the device to the edge to the cloud. This paper investigates the execution of machine learning algorithms within the edge-cloud continuum, focusing on their implications from a distributed computing perspective. We explore the integration of traditional ML algorithms, combining edge computing benefits such as low-latency processing and privacy preservation with cloud computing capabilities that offer virtually limitless computational and storage resources. Our analysis offers insights into optimizing the execution of machine learning applications by decomposing them into smaller components and distributing these across processing nodes in edge-cloud architectures. Using the Apache Spark framework, we define an efficient task allocation solution for distributing ML tasks across edge and cloud layers. Experiments on a clustering application in an edge-cloud setup confirm the effectiveness of our solution compared to highly centralized alternatives, in which cloud resources are used extensively to handle large volumes of data from IoT devices.
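To make the idea of decomposing a clustering application across edge and cloud nodes more concrete, the following is a minimal sketch and not the task allocation solution defined in the paper. It assumes a single k-means iteration over 2-D points, written in plain Python rather than Apache Spark so that it runs without a cluster. The names `edge_partial`, `cloud_merge`, and `kmeans_step` are hypothetical and chosen only for illustration. Each edge node reduces its local data to per-cluster sums and counts, and only these small summaries would travel to the cloud, which merges them into updated centroids.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]
# Per-cluster partial statistics: (sum_x, sum_y, count)
Partial = List[Tuple[float, float, int]]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: (point[0] - centroids[i][0]) ** 2
        + (point[1] - centroids[i][1]) ** 2,
    )


def edge_partial(points: Sequence[Point], centroids: Sequence[Point]) -> Partial:
    """Hypothetical edge-side task: summarize local points per cluster."""
    stats = [[0.0, 0.0, 0] for _ in centroids]
    for p in points:
        i = nearest(p, centroids)
        stats[i][0] += p[0]
        stats[i][1] += p[1]
        stats[i][2] += 1
    return [(sx, sy, n) for sx, sy, n in stats]


def cloud_merge(partials: Sequence[Partial], centroids: Sequence[Point]) -> List[Point]:
    """Hypothetical cloud-side task: merge edge summaries into new centroids."""
    updated = []
    for i, old in enumerate(centroids):
        sx = sum(p[i][0] for p in partials)
        sy = sum(p[i][1] for p in partials)
        n = sum(p[i][2] for p in partials)
        # Keep the previous centroid if no point was assigned to this cluster.
        updated.append((sx / n, sy / n) if n else old)
    return updated


def kmeans_step(edge_data: Sequence[Sequence[Point]], centroids: Sequence[Point]) -> List[Point]:
    """One k-means iteration: one partial per edge node, one merge in the cloud."""
    partials = [edge_partial(points, centroids) for points in edge_data]
    return cloud_merge(partials, centroids)
```

Because sums and counts are additive, the merged result matches what a fully centralized iteration over all raw points would compute, while each edge node ships only one small tuple per cluster. How such tasks should actually be assigned to edge and cloud layers is the subject of the paper and is not modeled here.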