Hierarchical binary histograms for summarizing multi-dimensional data

Furfaro, F.; Saccà, Domenico
2005-01-01

Abstract

The need to compress data into synopses of summarized information arises in many application scenarios where the aim is to retrieve aggregate data efficiently, possibly trading accuracy of the estimate for computational efficiency. A widely used approach for summarizing multi-dimensional data is the histogram-based representation scheme, which partitions the data domain into a number of blocks (called buckets) and stores summary information for each block. This paper proposes a new histogram-based summarization technique that is very effective for multi-dimensional data. The technique exploits a multi-resolution organization of summary data, on which an efficient physical representation model is defined. Adopting this representation model, which is based on a hierarchical organization of the buckets, saves storage space with respect to traditional histograms; the saved space can be invested in finer-grained blocks, thus approximating the data in more detail. Experimental results show that the technique retrieves aggregate information from the histogram more accurately than traditional approaches (classical multi-dimensional histograms as well as other types of summarization techniques). Copyright 2005 ACM.
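As an illustration of the general idea only, the following minimal sketch builds a bucket hierarchy over a small two-dimensional grid by repeated binary splits and answers range-sum queries from the stored bucket sums. It is not the algorithm of the paper: the greedy choice of the bucket with the largest squared error, the midpoint split along the longer side, the uniformity assumption inside a bucket, and all names (`Node`, `build`, `estimate`) are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[int, int, int, int]  # (r0, r1, c0, c1), half-open row/column ranges


@dataclass(eq=False)
class Node:
    box: Box
    total: float                   # sum of the cells inside box
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def box_sum(data, box):
    r0, r1, c0, c1 = box
    return sum(sum(row[c0:c1]) for row in data[r0:r1])


def box_sse(data, box):
    """Squared error of approximating every cell by the bucket average."""
    r0, r1, c0, c1 = box
    avg = box_sum(data, box) / ((r1 - r0) * (c1 - c0))
    return sum((v - avg) ** 2 for row in data[r0:r1] for v in row[c0:c1])


def split(box):
    """Halve a box along its longer side; None if it is a single cell."""
    r0, r1, c0, c1 = box
    if r1 - r0 >= c1 - c0 and r1 - r0 > 1:
        m = (r0 + r1) // 2
        return (r0, m, c0, c1), (m, r1, c0, c1)
    if c1 - c0 > 1:
        m = (c0 + c1) // 2
        return (r0, r1, c0, m), (r0, r1, m, c1)
    return None


def build(data, max_buckets):
    """Greedily split the leaf bucket with the largest error (assumed policy)."""
    whole = (0, len(data), 0, len(data[0]))
    root = Node(whole, box_sum(data, whole))
    leaves = [root]
    while len(leaves) < max_buckets:
        candidates = [n for n in leaves
                      if split(n.box) and box_sse(data, n.box) > 0]
        if not candidates:
            break  # every remaining bucket is already uniform or a single cell
        worst = max(candidates, key=lambda n: box_sse(data, n.box))
        a, b = split(worst.box)
        worst.left = Node(a, box_sum(data, a))
        worst.right = Node(b, box_sum(data, b))
        leaves.remove(worst)
        leaves += [worst.left, worst.right]
    return root


def estimate(node, query):
    """Estimate the sum over `query`, assuming uniform data inside each leaf."""
    r0, r1, c0, c1 = node.box
    q0, q1, p0, p1 = query
    h = max(0, min(r1, q1) - max(r0, q0))   # overlapping rows
    w = max(0, min(c1, p1) - max(c0, p0))   # overlapping columns
    if h == 0 or w == 0:
        return 0.0
    area = (r1 - r0) * (c1 - c0)
    if h * w == area or node.left is None:
        return node.total * h * w / area    # exact when fully covered
    return estimate(node.left, query) + estimate(node.right, query)


data = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 9, 9],
    [0, 0, 9, 9],
]
root = build(data, max_buckets=4)
print(estimate(root, (2, 4, 2, 4)))  # range sum over the lower-right quadrant
```

Because every split in this sketch falls at a midpoint, the boundaries of a bucket follow from its position in the tree and need not be stored with it. This is one way a hierarchical layout can save space, though the paper's actual representation model may differ.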
Year: 2005
ISBN: 1-58113-964-0
Keywords: Histograms; Multi-dimensional data; Range queries; Software


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/182851
