
Integrate data

This sub-process integrates data from one or more sources. The input data can come from a mixture of external and internal sources and from a variety of collection modes, including extracts of administrative data. The result is a harmonized data set. Data integration typically includes:
• matching / record linkage routines, with the aim of linking data from different sources, where those data refer to the same unit;
• prioritising, to decide which value to use when two or more sources contain data for the same variable (with potentially different values).
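The two activities above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sources share an exact unit identifier (here a hypothetical `unit_id` field), linkage is deterministic on that identifier, and prioritising is done by a fixed source order. The record layouts, the variable names `turnover` and `employees`, and the function `integrate` are all invented for the example. Real record linkage often has to cope with missing or inconsistent identifiers, which this sketch does not attempt.

```python
# Hypothetical input: two sources describing partly overlapping units.
survey = [
    {"unit_id": "A1", "turnover": 120, "employees": None},
    {"unit_id": "B2", "turnover": 75, "employees": 8},
]
admin = [
    {"unit_id": "A1", "turnover": 118, "employees": 14},
    {"unit_id": "C3", "turnover": 40, "employees": 3},
]


def integrate(sources, key="unit_id"):
    """Link records on an exact key and resolve conflicts by source priority.

    `sources` is a list of record lists, ordered from highest to lowest
    priority. For each variable, the first non-missing value encountered
    is kept; lower-priority sources only fill gaps.
    """
    merged = {}
    for records in sources:
        for rec in records:
            # Matching: records with the same key refer to the same unit.
            unit = merged.setdefault(rec[key], {key: rec[key]})
            for var, value in rec.items():
                # Prioritising: keep an existing value, fill only if missing.
                if unit.get(var) is None:
                    unit[var] = value
    return merged


# Survey data takes priority; administrative data fills the gaps.
harmonized = integrate([survey, admin])
```

With the survey listed first, unit A1 keeps the survey turnover of 120 and takes the employee count of 14 from the administrative source, because the survey value is missing. Reversing the order of the list reverses the priority.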
Data integration may take place at any point in this phase, before or after any of the other sub-processes. There may also be several instances of data integration in any statistical business process. Following integration, depending on data protection requirements, data may be anonymized, that is, stripped of identifiers such as name and address, to help protect confidentiality.
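As a minimal sketch of the anonymization step, the following Python removes direct identifiers from integrated records. The field names `name` and `address` follow the examples given in the text; the function `anonymize` and the sample record are hypothetical. Dropping direct identifiers is only one measure, and units may still be identifiable from combinations of the remaining variables.

```python
# Assumed set of direct identifiers, based on the examples in the text.
DIRECT_IDENTIFIERS = frozenset({"name", "address"})


def anonymize(records, identifiers=DIRECT_IDENTIFIERS):
    """Return copies of the records with direct identifier fields removed."""
    return [
        {field: value for field, value in rec.items() if field not in identifiers}
        for rec in records
    ]


# Hypothetical integrated record, for illustration only.
integrated = [
    {"name": "Example Ltd", "address": "1 Sample Street", "turnover": 120},
]
anonymized = anonymize(integrated)
```

The original records are left unchanged, so the identified version can stay in a restricted environment while the anonymized copy is passed on.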