Collect and capture data

This phase collects all necessary data using different collection modes (including extractions from administrative and statistical registers and databases), and loads them into the appropriate data environment. It does not include any transformations of collected data, as these are addressed in phase 5 (Process). This phase comprises four sub-processes:

  • 4.1. Select sample - This sub-process establishes the frame and selects the sample for this iteration of the collection, as specified in sub-process 2.4 (Design frame and sample methodology). It also includes the coordination of samples between instances of the same statistical business process (e.g., to manage overlap or rotation), and between different processes using a common frame or register (e.g., to manage overlap or distribute response burden). Quality assurance, approval, and maintenance of the frame and selected sample are also undertaken in this sub-process, though maintenance of underlying registers, from which frames for several statistical business processes are drawn, is treated as a separate business process. The sampling aspect of this sub-process is not usually relevant for processes based entirely on the use of pre-existing data sources (e.g., administrative data), as such processes generally create frames from the available data and then follow a census approach.

    The purpose of the handbook is to include in one publication sample survey design issues for convenient referral by practicing national statisticians, researchers, and analysts involved in sample survey work and activities. Methodologically sound techniques that are grounded in statistical theory are presented, implying the use of probability sampling at each stage of the sample selection process.
  • 4.2. Set up collection - This sub-process ensures that people, processes, and technology are ready to collect data in all modes, as designed. The sub-process takes place over time, and includes strategy, planning, and training activities in preparation for the specific instance of the statistical business process. Where the process is repeated regularly, some or all of these activities may not be explicitly required for each iteration. For one-off and new processes, these activities can be lengthy. This sub-process includes:
    • preparing a collection strategy;
    • training collection staff;
    • ensuring collection resources (e.g., laptops) are available;
    • configuring collection systems to request and receive data;
    • ensuring the security of data to be collected;
    • preparing collection instruments (e.g., printing questionnaires, populating them with existing data, loading questionnaires and data onto interviewers’ computers).
  • 4.3. Run collection - This sub-process is where the collection is implemented, with different collection instruments used to collect the data. It includes initial contact with providers and any subsequent follow-up or reminders. It records when and how providers were contacted and whether they have responded. This sub-process also includes management of the providers involved in the current collection, ensuring that the relationship between the statistical organization and data providers remains positive, and recording and responding to comments, queries, and complaints.
  • 4.4. Finalize collection (data capture) - This sub-process includes loading the collected data and metadata into a suitable electronic environment for further processing in phase 5 (Process). It may include automatic data take-on, e.g., using optical character recognition tools to extract data from paper questionnaires or converting the formats of data files received from other organizations. In cases where there is a physical data collection instrument that is not needed for further processing, such as a paper questionnaire, this sub-process manages the archiving of that material in conformance with the principles established in phase 8 (Archive).

    Survey Solutions is a Computer-Assisted Personal Interview technology developed by the World Bank. It assists governments, statistical offices and non-governmental organisations in conducting complex surveys with dynamic structures using tablet devices. The software can be tailored to the needs of the clients, allowing them to successfully complete simple and more sophisticated projects: from basic evaluation questionnaires to complicated multistage panel surveys. The software is offered free of charge, its development being co-financed by the World Bank, the Bill and Melinda Gates Foundation and the Food and Agriculture Organization of the United Nations. Surveys can be conducted on low-cost Android tablets.

    The Census and Survey Processing System (CSPro) is a public domain software package used by hundreds of organizations and tens of thousands of individuals for entering, editing, tabulating, and disseminating census and survey data. CSPro is user-friendly, yet powerful enough to handle the most complex applications. It can be used by a wide range of people, from non-technical staff assistants to senior demographers and programmers.
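
As an illustration of the sample selection and coordination described under sub-process 4.1, the following is a minimal sketch in Python, not a prescribed method. It assumes the frame is simply a list of unit identifiers and uses simple random sampling without replacement; the function name `select_sample`, the `exclude` parameter, and the `unit-001` style identifiers are hypothetical and chosen only for this example. Excluding units selected in the previous iteration is one simple way to spread response burden; real designs are typically stratified or multistage and use more elaborate coordination schemes.

```python
import random


def select_sample(frame, n, exclude=frozenset(), seed=None):
    """Draw a simple random sample of n unit ids without replacement.

    frame   -- iterable of unit identifiers (the sampling frame)
    exclude -- units to leave out, e.g. those selected last iteration
    seed    -- fixed seed so the selection can be reproduced and audited
    """
    # Sort so that the result depends only on the frame contents and seed,
    # not on the order in which the frame happened to be supplied.
    eligible = sorted(u for u in frame if u not in exclude)
    if n > len(eligible):
        raise ValueError("sample size exceeds number of eligible units")
    rng = random.Random(seed)
    return sorted(rng.sample(eligible, n))


# Hypothetical frame of 100 units.
frame = [f"unit-{i:03d}" for i in range(1, 101)]

# Previous iteration's sample, then a non-overlapping sample for this one.
previous = set(select_sample(frame, 10, seed=1))
current = select_sample(frame, 10, exclude=previous, seed=2)
```

Note that once units are excluded, the draw is a simple random sample of the remaining eligible units rather than of the whole frame, so selection probabilities (and any weights derived from them) would need to reflect that.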