Indexing Uncertain Data in General Metric Spaces

ANGIULLI, Fabrizio; FASSETTI, Fabio
2012

Abstract

We address the problem of efficiently answering range queries over uncertain objects in a general metric space. Here, an uncertain object is an object that always exists but whose actual value is uncertain and is modeled by a multivariate probability density function. As a major contribution, this is the first work to provide an effective technique for indexing uncertain objects from general metric spaces. We generalize the reverse triangle inequality to the probabilistic setting so that it can be used as a discard condition. We then introduce a novel pivot-based indexing technique, called UP-index, and show how it can be employed to speed up range query computation. Importantly, the candidate selection phase of our technique noticeably reduces the set of candidates at little time cost. Finally, we provide a criterion for measuring the quality of a set of pivots and study the problem of selecting a good set of pivots according to this criterion. We report some intractability results and then design an approximate algorithm with statistical guarantees for selecting pivots. Experimental results validate the effectiveness of the proposed approach and show that the introduced technique may even be preferable to indexing techniques specifically designed for Euclidean space.
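
For readers unfamiliar with pivot-based filtering, the following minimal sketch illustrates the classical, deterministic version of the idea that the abstract generalizes. It is not the UP-index and does not handle uncertain objects: it only shows how the reverse triangle inequality, |d(q, p) - d(o, p)| <= d(q, o), gives a lower bound on d(q, o) from precomputed pivot distances, so that an object can be discarded without computing its distance to the query. All names here (`build_pivot_table`, `range_query`, the example data) are hypothetical and chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
Metric = Callable[[T, T], float]


def build_pivot_table(
    objects: Sequence[T], pivots: Sequence[T], dist: Metric
) -> List[List[float]]:
    """Precompute d(o, p) for every object o and every pivot p."""
    return [[dist(o, p) for p in pivots] for o in objects]


def range_query(
    query: T,
    radius: float,
    objects: Sequence[T],
    pivots: Sequence[T],
    table: Sequence[Sequence[float]],
    dist: Metric,
) -> Tuple[List[T], int]:
    """Return the objects within `radius` of `query`, plus the number of
    objects whose true distance had to be computed (the candidates)."""
    q_to_pivots = [dist(query, p) for p in pivots]
    results: List[T] = []
    verified = 0
    for obj, row in zip(objects, table):
        # Reverse triangle inequality: |d(q, p) - d(o, p)| <= d(q, o)
        # for every pivot p, so the largest such gap is a lower bound.
        lower_bound = max(
            (abs(qp - op) for qp, op in zip(q_to_pivots, row)), default=0.0
        )
        if lower_bound > radius:
            continue  # discard: obj cannot lie within the query radius
        verified += 1
        if dist(query, obj) <= radius:
            results.append(obj)
    return results, verified


if __name__ == "__main__":
    # Toy metric space: real numbers with absolute difference (assumed data).
    abs_dist: Metric = lambda a, b: abs(a - b)
    data = [0.0, 1.0, 5.0, 9.0, 10.0]
    piv = [0.0, 10.0]
    tab = build_pivot_table(data, piv, abs_dist)
    print(range_query(4.5, 1.0, data, piv, tab, abs_dist))
```

The discard test costs only a few subtractions per pivot, which is why a cheap candidate selection phase can pay off when the metric itself is expensive. In the setting of the abstract, the distances involve probability density functions, so the discard condition has to be stated probabilistically rather than as the simple comparison used here.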
Use this identifier to cite or link to this document: http://hdl.handle.net/20.500.11770/133233