Dagstuhl Sprint, October 2016 (Week Two)


The DDI (Data Documentation Initiative) metadata standard, originally created in 1995 to document social science research data, has in recent years become relevant to new user groups, including the official statistics and medical research communities. To respond to these new users, DDI is developing a model-based specification (DDI Version 4) that can be expressed in XML Schema, RDF-S/OWL, relational database schemas, and programming languages. Such a data model will make it easier to interact with other disciplines and other standards, to understand the specification, to develop and maintain it in a consistent and structured way, and to enable software development that is less dependent on specific DDI versions.

Throughout the past year, content modeling teams have been working virtually to model DDI 4 to ensure that it can document a broad spectrum of data. This year’s Dagstuhl “sprint” will focus on creating re-usable multi-purpose documentation, controlled vocabularies, complex data capture and description, and funding proposals.


This workshop will extend and build upon the progress made during the past three years of development. The sprint will focus on six main areas of work:

Re-usable Structured Documentation: As DDI releases a first version of the Codebook Functional View to the community, documentation and documentation structure become essential for its successful implementation. The work during this week will concentrate on developing both high-level and field-level documentation to assist archives, libraries, and statistical agencies migrating from an older version of DDI to DDI 4.

Integration of Data Capture: The integration of data capture into the full DDI Views model, validated against real-world examples that will be documented.

Validation of Data Description: Validation and quality assessment of the data description model using a range of real-world examples.

Controlled Vocabularies: A closer working relationship with the Controlled Vocabularies Group, since controlled vocabularies are used throughout the model.

Funding Opportunities: Explore opportunities and create re-usable documentation for potential funding proposals; the scope is local, national, and international.

Long-term Metadata Infrastructure: A long-term plan for how the DDI Alliance fits into the larger social science research community.


  1. Develop a documentation structure which can be re-used for multiple purposes. Initial base content will be written to support the needs of different user groups (e.g. a technical perspective, applied usage in substantive content areas). The modular form of the documentation will also provide the basis for training materials to be used by all DDI user communities.
  2. A profile of the DDI requirements for a tool to support the creation and maintenance of Controlled Vocabularies will be produced. A process for closer integration of the Controlled Vocabularies into the overall production activities of DDI will be defined. A canonical format for Controlled Vocabularies will be identified.
  3. The existing model on data capture will be integrated with other related parts of the overall model. Its utility will be validated against real-world examples, which will also serve as documentation.
  4. The data description model will be assessed against a range of real-world examples, which will be documented to illustrate the use of the model.
  5. Building blocks for funding proposals will be written.
  6. A strategic vision for how DDI enables the social science research infrastructure will be created.

See also the Dagstuhl webpage for this event.

Documents for Sprint

File (type), modified

copenhagen-sprint-2015-2015-12-07.zip (ZIP Archive), Jul 05, 2016 by Jared Alan Lyle
copenhagen-sprint-2015-report.pdf (PDF File), Jul 05, 2016 by Jared Alan Lyle
statue-724871_1920.jpg (JPEG File), Jul 05, 2016 by Jared Alan Lyle
Schloss_Nachts.jpg (JPEG File), Jul 05, 2016 by Jared Alan Lyle
Introduction to DDI4 Logical Data Description v4a.docx (Microsoft Word Document), Oct 11, 2016 by Wendy Thomas
DataDescriptionExamples.zip (ZIP Archive), Oct 14, 2016 by Michelle Edwards
DDI Data Description Example of Use.docx (Microsoft Word Document), Oct 14, 2016 by Michelle Edwards
DDI Data Capture.docx (Microsoft Word Document), Oct 14, 2016 by Michelle Edwards
Intro_Dagstuhl2016_ME_wk2.pptx (Microsoft Powerpoint Presentation), Oct 15, 2016 by Michelle Edwards
Intro_Dagstuhl2016_ME_wk2_v2.pptx (Microsoft Powerpoint Presentation), Oct 23, 2016 by Michelle Edwards
Atlassian Tool Use.docx (Microsoft Word Document), Oct 25, 2016 by Wendy Thomas

Papers and Outputs from Sprint

Preparing for the Sprint

In preparation, please review the following documents:

Read these (link):

Detailed outcomes of the workshop
Introduction to the DDI4 Logical Data Description Package (Document v.3, Document v.4)
Atlassian Tool Use (Document; note that this document may be updated periodically during the sprint)

Local Information

The Dagstuhl Sprint takes place 24-28 October 2016 at Schloss Dagstuhl.

See the separate page for practical information.

Participants

Name; affiliation; working group

George Alter; University of Michigan, Population Studies Center; Long-term Metadata Infrastructure
Ingo Barkow; HTW Chur, University of Applied Sciences; Funding Proposals
Bill Block; CISER - Cornell Institute for Social and Economic Research; Funding Proposals
Kerrin Borschewski; GESIS - Leibniz Institute for the Social Sciences; Controlled Vocabularies
Kelly Chatain; University of Michigan, Survey Research Center; Controlled Vocabularies/Data Capture
Michelle Edwards; CISER - Cornell Institute for Social and Economic Research; Data Description/Funding Proposals
Anne Etheridge; UKDS - UK Data Service; Controlled Vocabularies
Dan Gillman; BLS - U.S. Bureau of Labor Statistics; Data Description
Arofan Gregory; Aeon Technologies; Funding Proposals/Data Description
Oliver Hopt; GESIS - Leibniz Institute for the Social Sciences; Structured Re-usable Documentation
Larry Hoyle; University of Kansas, Institute for Policy & Social Research; Data Description
Sanda Ionescu; University of Michigan, ICPSR - Interuniversity Consortium for Political and Social Research; Controlled Vocabularies
Taina Jääskeläinen; FSD - Finnish Social Science Data Archive; Controlled Vocabularies
Jon Johnson; UCL Institute of Education, Centre for Longitudinal Studies; Structured Re-usable Documentation
Amber Leahey; University of Toronto; Structured Re-usable Documentation
Jared Lyle; University of Michigan, ICPSR - Interuniversity Consortium for Political and Social Research; Long-term Metadata Infrastructure
Katy McNeill; UKDS - UK Data Service; Funding Proposals
Katja Moilanen; FSD - Finnish Social Science Data Archive; Long-term Metadata Infrastructure
Barry Radler; University of Wisconsin-Madison, Institute on Aging; Data Capture
Dan Smith; Colectica; Data Capture
Wendy Thomas; University of Minnesota, Minnesota Population Center; Data Capture/Controlled Vocabularies
Joachim Wackerow; GESIS - Leibniz Institute for the Social Sciences; Long-term Metadata Infrastructure/Structured Re-usable Documentation
Knut Wenzig; DIW Berlin - German Institute for Economic Research, SOEP - German Socio-Economic Panel; Data Description
Wolfgang Zenk-Möltgen; GESIS - Leibniz Institute for the Social Sciences; Data Capture