Programming models and systems for Big Data analysis
Belcastro L.; Marozzo F.; Talia D.
2019-01-01
Abstract
Big Data analysis refers to advanced and efficient data mining and machine learning techniques applied to large amounts of data. Research on Big Data analysis is growing steadily, and new, more efficient architectures, programming models, systems, and data mining algorithms continue to be proposed. Taking into account the most popular programming models for Big Data analysis (MapReduce, Directed Acyclic Graph, Message Passing, Bulk Synchronous Parallel, Workflow and SQL-like), we analysed the features of the main systems implementing them. These systems are compared using four classification criteria (level of abstraction, type of parallelism, infrastructure scale and classes of applications) to help developers and users identify and select the best solution according to their skills, hardware availability, productivity and application needs.