Approximate Probabilistic Query Answering over Inconsistent Databases

GRECO, Sergio; MOLINARO, Cristian
2008-01-01

Abstract

The problem of managing and querying inconsistent databases has been investigated in depth over the last few years. Most of the approaches proposed so far rely on the notions of repair (a minimal set of delete/insert operations that makes the database consistent) and consistent query answer (the answer to a query is obtained by considering the set of "repaired" databases). Since consistent query answering is hard in the general case, most of the proposed techniques have exponential complexity, although the problem becomes polynomial for special classes of constraints and queries. A second problem with most of the proposed approaches is that repairs do not take update operations into account (they consider delete and insert operations only). This paper presents a general framework in which constraints consist of functional dependencies and queries may be expressed in positive relational algebra. The framework allows us to compute certain query answers (i.e. tuples derivable from all or from none of the repaired databases) and uncertain query answers (i.e. tuples derivable from a proper nonempty subset of the repaired databases). Each tuple in the answer is associated with a probability, which depends on the number of repaired databases from which the tuple can be derived. In the proposed framework, databases are repaired by means of update operations, and the repaired databases are stored as a single "condensed" database, so that all of them can be derived by "expanding" that condensed database. A condensed database can be rewritten into a probabilistic database in which each tuple is associated with an event (i.e. a boolean formula) and, thus, with a probability value. The probabilistic query answer can then be computed by querying the resulting probabilistic database. Since querying probabilistic databases is #P-complete, approximate probabilistic answers that are computable in polynomial time are considered.
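The idea of attaching a probability to each answer tuple can be illustrated with a minimal sketch. It is not the paper's algorithm: it enumerates repairs explicitly (which is exponential in general) instead of using a condensed or probabilistic database. It also assumes that a repair by update keeps, for each left-hand-side value of a functional dependency, exactly one of the conflicting right-hand-side values, and that all repairs are equally likely. The relation `EMP`, the dependency name -> dept, and the helpers `repairs`, `answer_probabilities` and `depts` are hypothetical names introduced only for this example.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical relation Emp(name, dept) that violates the
# functional dependency name -> dept ("ann" has two departments).
EMP = [("ann", "sales"), ("ann", "hr"), ("bob", "it")]


def repairs(rows):
    """Enumerate repairs by update (assumed semantics): every key keeps
    exactly one of the values it is associated with in the input."""
    values = defaultdict(list)
    for key, val in rows:
        if val not in values[key]:
            values[key].append(val)
    keys = sorted(values)
    # One repaired database per combination of per-key choices.
    for choice in product(*(values[k] for k in keys)):
        yield set(zip(keys, choice))


def answer_probabilities(rows, query):
    """Map each answer tuple to the fraction of repairs whose query
    result contains it (uniform weighting is an assumption)."""
    counts = defaultdict(int)
    total = 0
    for repaired in repairs(rows):
        total += 1
        for answer in query(repaired):
            counts[answer] += 1
    return {answer: Fraction(c, total) for answer, c in counts.items()}


def depts(db):
    """A positive relational algebra query: project Emp onto dept."""
    return {(dept,) for _, dept in db}


probs = answer_probabilities(EMP, depts)
for answer, p in sorted(probs.items()):
    kind = "certain" if p == 1 else "uncertain"
    print(answer, p, kind)
```

In this toy instance there are two repairs, so the tuple `("it",)` appears in both and is a certain answer, while `("sales",)` and `("hr",)` each appear in one repair and are uncertain answers with probability 1/2.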
2008
Polynomial approximation; Probability; Data repair

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.11770/171616