Census Data Repair: a challenging application of Disjunctive Logic Programming
LEONE, Nicola; SCARCELLO, F.
2001-01-01
Abstract
Census data provide valuable insights into the economic, social and demographic conditions and trends of a country. The data are collected by means of millions of questionnaires, each one recording the details of the persons living together in the same house. Before the data from the questionnaires are sent to the statisticians for analysis, a cleaning phase (called "imputation") is performed in order to eliminate inconsistencies, missing answers, and errors. It is important that the imputation step does not alter the statistical validity of the collected data. The contribution of this paper is twofold. On the one hand, it provides a clear and well-founded declarative semantics for questionnaires and for the imputation problem. On the other hand, it presents a correct, modular encoding of the problem in the disjunctive logic programming language DLPw, which is supported by the DLV system. DLPw turns out to be very well suited to this goal, and census data repair appears to be a challenging application area for disjunctive logic programming.