Hierarchical big data clustering

Ianni M.;
2015-01-01

Abstract

The urgent need for new techniques for big data analysis has drawn a great deal of attention from both the research and industry communities. Among the techniques that must be redesigned for big data analysis, clustering plays a crucial role. Dealing with big data implies that the data to be analyzed range in size from terabytes to petabytes, which makes clustering algorithms challenging to use because of their relatively high computational cost. In this paper we discuss how to tackle this problem and how to implement a clustering strategy that is suitable for big data and has a reasonable execution time. We focus on hierarchical clustering, as this class of algorithms easily meets some of the constraints set by big data features while allowing the use of the most efficient solution for data access in distributed environments.

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/328614