Clustering uncertain data has emerged as a challenging task in uncertain data management and mining. Thanks to a computational complexity advantage over other clustering paradigms, partitional clustering has been particularly studied and a number of algorithms have been developed. While existing proposals differ mainly in the notions of cluster centroid and clustering objective function, little attention has been given to an analysis of their characteristics and limits. In this work, we theoretically investigate major existing methods of partitional clustering, and alternatively propose a well-founded approach to clustering uncertain data based on a novel notion of cluster centroid. A cluster centroid is seen as an uncertain object defined in terms of a random variable whose realizations are derived based on all deterministic representations of the objects to be clustered. As demonstrated theoretically and experimentally, this allows for better representing a cluster of uncertain objects, thus supporting a consistently improved clustering performance while maintaining comparable efficiency with existing partitional clustering algorithms.
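As a rough illustration of the idea that a cluster centroid can itself be an uncertain object, the following minimal sketch draws realizations of such a centroid by Monte Carlo sampling. It is one plausible reading of the abstract, not the authors' algorithm. The sample-based representation of uncertain objects, the equal weighting of samples, the use of the arithmetic mean, and the name `centroid_realizations` are all assumptions made for this example.

```python
import random


def centroid_realizations(cluster, n_draws, seed=0):
    """Draw realizations of a hypothetical 'uncertain centroid'.

    cluster: list of uncertain objects. Each object is assumed to be a
             list of equally likely sample points (tuples of floats that
             all share the same dimension).
    Each realization is the mean of one sampled point per object, i.e.
    the centroid of one deterministic representation of the cluster.
    """
    rng = random.Random(seed)
    dim = len(cluster[0][0])
    realizations = []
    for _ in range(n_draws):
        # Fix one deterministic representation for every object.
        picks = [rng.choice(obj) for obj in cluster]
        # Average the chosen points coordinate by coordinate.
        realizations.append(
            tuple(sum(p[k] for p in picks) / len(picks) for k in range(dim))
        )
    return realizations


# Toy cluster (assumed data): object A has two possible positions,
# object B has a single one.
cluster = [
    [(0.0, 0.0), (2.0, 0.0)],
    [(4.0, 4.0)],
]
draws = centroid_realizations(cluster, n_draws=1000)
```

In this toy case every draw is the midpoint between the position of object B and one of the two positions of object A, so the centroid is a distribution over points rather than a single point.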
Uncertain Centroid based Partitional Clustering of Uncertain Data / Gullo, F; Tagarelli, Andrea. - In: PROCEEDINGS OF THE VLDB ENDOWMENT. - ISSN 2150-8097. - 5:7(2012), pp. 610-621.
Title: | Uncertain Centroid based Partitional Clustering of Uncertain Data |
Authors: | Gullo, F; Tagarelli, Andrea |
Publication date: | 2012 |
Journal: | Proceedings of the VLDB Endowment |
Handle: | http://hdl.handle.net/20.500.11770/144427 |
Appears in categories: | 1.1 Journal article |