A property of the CHAID partitioning method for dichotomous randomized response data and categorical predictors
PERRI, PIER FRANCESCO;
2012-01-01
Abstract
In this paper, we present empirical and theoretical results on classification trees for randomized response data. We consider a dichotomous sensitive response variable whose true status is intentionally misclassified by the respondents according to rules prescribed by a randomized response method. We assume that classification trees are grown using the Pearson chi-square test as a splitting criterion, and that the randomized response data are analyzed using classification trees as if they were not perturbed. We prove that classification trees analyzing observed randomized response data and estimated true data have a one-to-one correspondence in terms of ranking the splitting variables. This is illustrated using two real data sets.
Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.
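The ranking property stated in the abstract can be illustrated with a minimal sketch. Nothing below is taken from the paper: the counts are invented, the function names are hypothetical, and the misclassification probabilities (0.8 for both "yes" and "no") are assumed known and equal for all respondents. The "estimated true data" are obtained here with a simple moment estimator, which is one plausible reading of the abstract rather than the authors' documented procedure. The sketch compares Pearson chi-square statistics only, so it is most directly relevant to predictors with the same number of categories.

```python
def pearson_chi2(table):
    """Pearson chi-square statistic for a K x 2 table of counts.

    Counts may be non-integer, as happens for estimated true counts.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat


def estimate_true_table(observed_table, p_yes_given_yes, p_no_given_no):
    """Moment estimate of the true (yes, no) counts in each predictor category.

    Assumes P(observed yes) = (1 - p_no_given_no) + c * P(true yes),
    with c = p_yes_given_yes + p_no_given_no - 1 (must be non-zero).
    """
    c = p_yes_given_yes + p_no_given_no - 1.0
    estimated = []
    for yes, no in observed_table:
        n_k = yes + no
        observed_rate = yes / n_k
        true_rate = (observed_rate - (1.0 - p_no_given_no)) / c
        estimated.append([n_k * true_rate, n_k * (1.0 - true_rate)])
    return estimated


# Invented observed (yes, no) counts for three candidate predictors,
# all cross-classifying the same 200 hypothetical respondents.
predictors = {
    "A": [[70, 30], [30, 70]],
    "B": [[55, 45], [45, 55]],
    "C": [[50, 30], [50, 70]],
}

P_YES_GIVEN_YES = 0.8  # assumed design probability, not from the paper
P_NO_GIVEN_NO = 0.8    # assumed design probability, not from the paper

chi2_observed = {name: pearson_chi2(t) for name, t in predictors.items()}
chi2_estimated = {
    name: pearson_chi2(estimate_true_table(t, P_YES_GIVEN_YES, P_NO_GIVEN_NO))
    for name, t in predictors.items()
}

# Rank predictors from strongest to weakest association with the response.
ranking_observed = sorted(chi2_observed, key=chi2_observed.get, reverse=True)
ranking_estimated = sorted(chi2_estimated, key=chi2_estimated.get, reverse=True)

print("ranking on observed data: ", ranking_observed)
print("ranking on estimated data:", ranking_estimated)
```

In this toy setting the two statistics differ by a factor that depends only on the overall response rate and the design probabilities, not on the predictor, which is why the two rankings coincide. The sketch assumes the estimated true rates fall strictly between 0 and 1; it does not handle estimates outside that range.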