Characteristics of a good catalog
What makes a good catalog?
The user point of view
- Complies with international metadata standards. International XML metadata standards such as the DDI (Data Documentation Initiative) and the Dublin Core considerably facilitate the production and maintenance of such catalogs.
- Is web-based to facilitate discovery.
- Provides rich metadata, including at the variable level. Survey catalogs become particularly relevant and powerful when the metadata describe in detail not only the survey itself (title, primary investigator, sampling, date of data collection, topics, geographic coverage, etc.) but also each variable (variable name and label, categories, literal question, interviewer’s instructions, and definitions). A variable-level catalog can be established relatively easily using the DDI metadata standard and IHSN tools, in particular the IHSN Microdata Management Toolkit and the free National Data Archive (NADA) application.
- Is searchable within all relevant fields of the study. Within the DDI framework, this means the catalog should be searchable within both the study description fields (title, year, country, organization) and the variable description fields (variable name, variable label, variable value label). The catalog should provide user-friendly full-text search.
- Provides clear information on the policies and procedures for accessing the data.
- Provides a list of, and direct access to, reference materials (questionnaires, manuals, reports).
- Includes a “search by topic” feature that complies with a standard taxonomy of topics. To facilitate the exchange of information among catalogs, the data archive community has developed a thesaurus to describe the topics covered by the datasets listed in their respective catalogs. A thesaurus is a set of terms or concepts used to describe objects such as datasets, variables, and books. The terms in a thesaurus are normally organized as a hierarchy, with broader terms being parents to narrower terms. A thesaurus usually also includes parallel terms and synonyms, so that users can find what they are looking for even when they do not use the preferred terms. Many archives use a thesaurus when adding keywords at the study level or concepts at the variable level. Using a thesaurus encourages consistency by ensuring that the same terms are selected when describing identical objects. Moreover, if users have access to the thesaurus when searching for data, they are more likely to use terms and concepts that return the most relevant list of hits. An example of the use of a thesaurus is the catalog maintained by the Council of European Social Science Data Archives (CESSDA), an umbrella organization for social science data archives across Europe.
- Is capable of displaying the results of searches quickly, even in large catalogs. This implies an efficient indexing system.
- Provides a means to compare catalog items. This is useful for comparing variables in standardized surveys, or in surveys for which multiple versions of the same study have been uploaded.
- Provides easily visible information on access policies for each study. For example, it will indicate whether the microdata are available and, if so, provide clear instructions on how to obtain them.
- Provides good on-screen help for users.
- Provides a means to link catalog items to external web resources and to attach additional information, such as bibliographic references to publications that have used the study.
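The search-related items above (searching both study and variable description fields, and topic search supported by a thesaurus) can be illustrated with a small sketch. It is only an illustration under stated assumptions, not how NADA or any DDI tool implements search: the in-memory catalog, its field names, the two-entry thesaurus, and the functions `expand_term` and `search` are all hypothetical.

```python
# Hypothetical in-memory catalog. The field names loosely echo the study-level
# and variable-level description fields mentioned above.
CATALOG = [
    {
        "title": "Household Income Survey",
        "year": 2010,
        "country": "Kenya",
        "variables": [
            {"name": "hh_inc", "label": "Total household earnings"},
            {"name": "hh_size", "label": "Number of household members"},
        ],
    },
    {
        "title": "Labour Force Survey",
        "year": 2012,
        "country": "Ghana",
        "variables": [
            {"name": "emp_stat", "label": "Employment status"},
        ],
    },
]

# Hypothetical thesaurus: preferred term -> synonyms and narrower terms.
THESAURUS = {
    "income": {"synonyms": ["earnings"], "narrower": ["wages"]},
    "employment": {"synonyms": ["work"], "narrower": ["unemployment"]},
}


def expand_term(term):
    """Expand a preferred term or synonym to all related terms."""
    term = term.lower()
    for preferred, entry in THESAURUS.items():
        if term == preferred or term in entry["synonyms"]:
            return {preferred, *entry["synonyms"], *entry["narrower"]}
    return {term}  # unknown terms are searched as typed


def search(query):
    """Return (study title, matched level) pairs for study and variable hits."""
    terms = expand_term(query)
    hits = []
    for study in CATALOG:
        study_text = f'{study["title"]} {study["year"]} {study["country"]}'.lower()
        if any(t in study_text for t in terms):
            hits.append((study["title"], "study"))
        for var in study["variables"]:
            var_text = f'{var["name"]} {var["label"]}'.lower()
            if any(t in var_text for t in terms):
                hits.append((study["title"], f'variable:{var["name"]}'))
    return hits
```

In this sketch, a query for "earnings" also matches the study titled "Household Income Survey", because the thesaurus maps the synonym back to the preferred term "income". A real catalog would rely on an index rather than a linear scan to keep searches fast in large collections.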
The administrator's point of view
- Provides a secure environment for storing and sharing data and metadata.
- Provides tools to manage the microdata access process. This ranges from automated approval for microdata with no access restrictions, to systems for managing and processing applications for which vetting is required before access is granted.
- Provides a solution for sharing public use files and licensed files.
- Provides a secure means for sharing microdata and documentation, thus increasing end-user access.
- Collects information on the uses and users of the catalog: which data are downloaded and, where required, the purpose for which they are used. Such records are useful to the sponsors of studies, as they provide a means to gauge the use of the microdata. They are also useful to users, since they make it possible to inform users when new versions of the data are published or when changes are made to studies they have downloaded.
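The access-management and usage-tracking items above can be sketched as a small workflow. This is a minimal illustration under assumptions, not the behavior of any particular catalog application: the class names, the two policy labels ("open" and "licensed"), and the rule that a licensed request must state a purpose are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class AccessRequest:
    user: str
    study_id: str
    purpose: str = ""  # assumed to be required for licensed files


@dataclass
class AccessManager:
    policies: dict  # study_id -> "open" or "licensed" (hypothetical labels)
    pending: list = field(default_factory=list)
    log: list = field(default_factory=list)

    def _record(self, request, status):
        # Every decision is logged so that use of the microdata can be reported.
        self.log.append({
            "user": request.user,
            "study_id": request.study_id,
            "purpose": request.purpose,
            "status": status,
            "time": datetime.now(timezone.utc).isoformat(),
        })
        return status

    def submit(self, request):
        policy = self.policies.get(request.study_id)
        if policy == "open":
            # No access restrictions: approve automatically.
            return self._record(request, "approved")
        if policy == "licensed" and request.purpose:
            # Vetting required: queue the request for an administrator.
            self.pending.append(request)
            return self._record(request, "pending")
        return self._record(request, "rejected")

    def approve(self, request):
        # Called by an administrator after reviewing a pending request.
        self.pending.remove(request)
        return self._record(request, "approved")

    def users_of(self, study_id):
        # Users to notify when a new version of the study is published.
        return sorted({e["user"] for e in self.log
                       if e["study_id"] == study_id and e["status"] == "approved"})
```

For example, with `AccessManager(policies={"S1": "open", "S2": "licensed"})`, a request for `S1` is approved immediately, while a request for `S2` stays pending until `approve` is called. `users_of` then gives the list of people to inform about a new release.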