Information on DDI and official statistics specifications

Information on DDI

Purpose, Coverage

The Data Documentation Initiative (DDI) is an international standard for describing statistical and social science data. Documenting data with DDI facilitates interpretation and understanding by both humans and computers. DDI describes data that result from observational methods in the social, behavioral, economic, and health sciences.

  • DDI Codebook (DDI-C) is the lighter-weight version of the standard, intended primarily to document simple survey data. Originally DTD-based, DDI-C is now available as an XML Schema.
    • Used for archives, codebooks, and catalogues; e.g., it underpins Dataverse.
  • DDI Lifecycle (DDI-L) is designed to document and manage data across the entire life cycle, from conceptualization to data publication, analysis, and beyond. It encompasses all of the DDI Codebook specification and extends it. Based on XML Schemas, DDI Lifecycle is modular and extensible. The full questionnaire structure is captured.
  • The Moving Forward Project (DDI 4) focuses on model-based development that enables multiple bindings of DDI, such as XML Schema and RDF-S/OWL. The model-based approach should make it easier to interoperate with other specifications. On the content side, the data life cycle will be covered more completely, and generalized approaches to data collection and data description are being developed.
  • The DDI-RDF Discovery Vocabulary (Disco) specification is designed to support the discovery of microdata sets and related metadata using RDF technologies in the Web of Linked Data. The vocabulary leverages the DDI specification to create a simplified version of this model for the discovery of data files. It is based on a subset of the XML formats of DDI Codebook and DDI Lifecycle, and it supports programmatically identifying the datasets relevant to a specific research purpose. Existing DDI XML instances can be transformed into this RDF format and thereby exposed in the Web of Linked Data. The reverse process is not intended, because the developers of the vocabulary have defined DDI-RDF components and reused components of other RDF vocabularies that make sense only in the Linked Data field.
  • XKOS (Extended Knowledge Organization System) started as an RDF vocabulary that expanded on content already in DDI-L and mapped it to SKOS, and was then extended to support statistical classifications; this is reflected in the Moving Forward work. XKOS leverages the Simple Knowledge Organization System (SKOS) for managing statistical classifications and concept management systems, since SKOS is widely used. Linked Open Data (LOD) is used to create Web artifacts that machines can interpret, so it is desirable to publish machine-readable statistical classifications and other concept management systems as SKOS instances. XKOS extends SKOS for the needs of statistical classifications in two main directions. First, it defines a number of terms that enable the representation of statistical classifications with their structure and textual properties, as well as the relations between classifications. Second, it refines SKOS semantic properties to allow the use of more specific relations between concepts. Those specific relations can be used for the representation of classifications or in any other case where SKOS is employed.
  • Controlled vocabularies for use with DDI.
  • Overview of what the work products of the DDI Alliance are, what purpose they serve, and how they are maintained.
  • DDI Alliance website
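
The Disco entry above notes that existing DDI XML instances can be transformed into RDF. The following is a minimal sketch of that idea using only the Python standard library, not an official or complete mapping. The sample document, the `http://example.org/` base URI, and the helper name `codebook_to_triples` are hypothetical, and the choice of Disco, Dublin Core, and SKOS terms is an assumption that should be checked against the Disco specification.

```python
import xml.etree.ElementTree as ET

# Namespace of DDI Codebook 2.5 instances.
DDI_NS = {"ddi": "ddi:codebook:2_5"}

# Vocabulary namespaces. The specific terms used below are assumptions
# about Disco and should be verified against the specification.
DISCO = "http://rdf-vocabulary.ddialliance.org/discovery#"
DCTERMS = "http://purl.org/dc/terms/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"

# Hypothetical, heavily trimmed DDI Codebook instance for illustration.
SAMPLE = """<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr>
    <citation><titlStmt><titl>Example Household Survey</titl></titlStmt></citation>
  </stdyDscr>
  <dataDscr>
    <var ID="V1" name="age"><labl>Age of respondent</labl></var>
    <var ID="V2" name="sex"><labl>Sex of respondent</labl></var>
  </dataDscr>
</codeBook>"""


def codebook_to_triples(xml_text, base="http://example.org/study/1"):
    """Return (subject, predicate, object) tuples for a study and its variables."""
    root = ET.fromstring(xml_text)
    triples = [(base, RDF_TYPE, DISCO + "Study")]

    # Study title from the citation block, if present.
    title = root.findtext(
        "ddi:stdyDscr/ddi:citation/ddi:titlStmt/ddi:titl", namespaces=DDI_NS
    )
    if title:
        triples.append((base, DCTERMS + "title", title.strip()))

    # One resource per variable, linked back to the study.
    for var in root.findall("ddi:dataDscr/ddi:var", DDI_NS):
        name = var.get("name")
        var_uri = f"{base}/variable/{name}"
        triples.append((var_uri, RDF_TYPE, DISCO + "Variable"))
        triples.append((base, DISCO + "variable", var_uri))
        triples.append((var_uri, SKOS + "notation", name))
        label = var.findtext("ddi:labl", namespaces=DDI_NS)
        if label:
            triples.append((var_uri, SKOS + "prefLabel", label.strip()))
    return triples


if __name__ == "__main__":
    for triple in codebook_to_triples(SAMPLE):
        print(triple)
```

The sketch returns plain tuples rather than a serialized graph so that it has no dependencies; in practice the tuples would be loaded into an RDF library and serialized as Turtle or RDF/XML. It is one-directional by design, in line with the note that the reverse transformation is not intended.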

Official Statistics

  • GSBPM - Generic Statistical Business Process Model.
  • GLBPM - Generic Longitudinal Business Process Model. An extension of GSBPM for longitudinal data, developed by the DDI Alliance.
  • GSIM - Generic Statistical Information Model.