Top-down parameter-free clustering of high-dimensional categorical data

Cesario E.; Ortale R.
2006

Abstract

A parameter-free, fully automatic approach to clustering high-dimensional categorical data is proposed. The algorithm attempts to improve the overall quality of the whole partition and finds clusters whose number is established naturally from the inherent features of the underlying dataset, rather than being specified in advance. Experiments on both synthetic and real data show that the devised algorithm scales linearly and achieves nearly optimal results in terms of compactness and separation.

Use this identifier to cite or link to this document: http://hdl.handle.net/20.500.11770/303523