Comparing SDC Methods for Microdata on the Basis of Information Loss and Disclosure Risk
Submitted by admin on Fri, 11/23/2012
This paper presents the first empirical comparison of statistical disclosure control (SDC) methods for microdata that covers both continuous and categorical microdata. Using re-identification experiments, we seek the best tradeoff between information loss and disclosure risk. First, relevant SDC methods for continuous and categorical microdata are identified. Then generic information loss measures (not targeted to specific data uses) are defined for both the continuous and the categorical case. Disclosure risk is assessed empirically through re-identification experiments, using two approaches: Euclidean record linkage and probabilistic record linkage. The results of this comparison will be used to develop better SDC methods for microdata in the recently started EU-funded project CASC.
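To make the idea of Euclidean record linkage concrete, the following is a minimal sketch rather than the procedure used in the paper. It assumes continuous attributes only, that masked record i was derived from original record i, and that each file is standardized attribute by attribute before distances are computed. The function names `standardize` and `linkage_risk` and the toy data are hypothetical and chosen only for illustration.

```python
import math


def standardize(records):
    """Rescale each attribute to zero mean and unit standard deviation."""
    n = len(records)
    dims = len(records[0])
    means = [sum(r[j] for r in records) / n for j in range(dims)]
    stds = []
    for j in range(dims):
        var = sum((r[j] - means[j]) ** 2 for r in records) / n
        stds.append(math.sqrt(var) or 1.0)  # guard against constant attributes
    return [[(r[j] - means[j]) / stds[j] for j in range(dims)] for r in records]


def linkage_risk(original, masked):
    """Fraction of masked records whose nearest original record is their source.

    Assumption: masked[i] was produced from original[i]. Ties are resolved
    in favour of the lowest index.
    """
    orig_std = standardize(original)
    mask_std = standardize(masked)
    hits = 0
    for i, m in enumerate(mask_std):
        # Link the masked record to the closest original record.
        nearest = min(range(len(orig_std)),
                      key=lambda k: math.dist(m, orig_std[k]))
        if nearest == i:
            hits += 1
    return hits / len(masked)


if __name__ == "__main__":
    # Toy data, invented for this sketch: three records, two attributes.
    original = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
    masked = [(1.0, 0.0), (10.0, 9.0), (19.0, 1.0)]  # lightly perturbed copy
    print(linkage_risk(original, masked))
```

A higher returned fraction means more masked records can be traced back to their source, that is, higher empirical disclosure risk. Probabilistic record linkage would replace the nearest-neighbour step with a match-probability model and is not covered by this sketch.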