
Microdata anonymization

Statistical agencies and other data producers are increasingly publishing microdata obtained from sample surveys, censuses, and administrative data collection systems. The dissemination of microdata is made necessary by a high demand from the research community, a push for transparency, and sometimes by legal or contractual obligations. This must be done in such a way that the confidentiality of the information provided by respondents is preserved.

In this section we present:

Links to available tools are also provided, as well as a compilation of practices.

Anonymization is typically required for the production of public use files, and to a lesser extent, for generating licensed files. But anonymization is only one of many solutions to minimize the risk of disclosure when distributing microdata. Other legal and organizational measures contribute to this endeavor as well. For datasets provided to selected bona fide users, the legal agreement may include a higher level of security than anonymization alone (see the section on formulating a data dissemination policy).

Three guides have been produced:

  • Theory Guide - which provides an overview of common methods as well as of the SDC process,
  • Practice Guide - which describes how to apply methods using the command line interface for the R package sdcMicro, and
  • Manual for sdcApp - a graphical user interface for sdcMicro, for users not comfortable using R from the command line.

This guide provides an introduction to the theory of Statistical Disclosure Control (SDC) for microdata. It includes an overview of the most commonly applied SDC methods, a step-by-step overview of the complete SDC process, and many examples from practice in National Statistics Offices (NSOs).
For guidance on the technical implementation of the theory mentioned in the guide, please refer to our guides:
- Statistical Disclosure Control for Microdata: A Practice Guide for guidance on the application of methods and on using sdcMicro from the command-line
- sdcApp manual for guidance on the application of methods and on using the GUI sdcApp available for sdcMicro

Download

Releasing data safely is required to protect the integrity of the statistical system: agencies must honor their commitment to respondents to protect their identity. Agencies rarely share with one another, in substantial detail, their knowledge of and experience with SDC and the processes for creating safe data. This makes it difficult for agencies new to the process to implement solutions. We consolidated knowledge from the literature and from our own experience to inform the discussion of the processes and methods presented in this guide. This guide focuses on the implementation of methods and uses the free R-based package sdcMicro for its examples. For a detailed treatment of the theory behind the methods, we suggest our accompanying guide: Statistical Disclosure Control for Microdata: Theory.

Download

This is documentation and guidance for using sdcApp, a graphical user interface for the sdcMicro R package. sdcMicro provides tools for Statistical Disclosure Control (SDC) for microdata, also known as microdata anonymization. For an overview of the theory of SDC for microdata, we suggest reading: Statistical Disclosure Control for Microdata: A Theory Guide.

Download

As well as an IHSN Working paper:

This guide, Introduction to Statistical Disclosure Control (SDC), discusses common SDC methods for microdata obtained from sample surveys, censuses and administrative sources.

Download