DDI Newsletter - Q1 2019

DDI Annual Meeting 

The Annual Meeting of Member Representatives will be held on Saturday, 1 June 2019, in Sydney, Australia (directly after the IASSIST Annual Conference, 27-31 May 2019, also to be held in Sydney).

The Annual Meeting of Member Representatives provides a forum for member discussion and feedback. Please save the date. We look forward to seeing many member representatives at the meeting.

Welcome New Member: Statistics Canada

We are excited to welcome our newest Full Member, Statistics Canada!

DDI 4 Review

This is a reminder that the projected three-month DDI 4 Prototype Review Period is coming to an end. Please submit your issues and comments before Monday, February 4, 2019.

DDI 4 - Comment and Review

EDDI 2018 Conference

The 10th Annual European DDI User Conference (EDDI18) was held on December 4-5, 2018. EDDI18 was organized jointly by SOEP - The German Socio-Economic Panel, GESIS - Leibniz Institute for the Social Sciences and IDSC of IZA - International Data Service Center of the Institute for the Study of Labor, and hosted by SOEP at DIW (German Institute for Economic Research), Berlin, Germany. The program is available on the conference website.

There were nearly XXXX participants from over XXXX organisations and XXXX countries. The conference opened with a keynote speech, "Making Fair Data a Reality... and the Challenges of Interoperability and Reusability", by Simon Hodson (Executive Director of CODATA, the Committee on Data of the International Council for Science), and included 26 presentations, 2 tutorials, posters, discussion sessions, and a side meeting. Nearly all presentations and posters are available at https://zenodo.org/communities/edd18/.


A panel discussion, introduced by Jon Johnson (https://doi.org/10.5281/zenodo.2530104), covered the question of licenses for metadata publishing and how to credit metadata producers appropriately.

EDDI2019 will be hosted by the Finnish Social Science Data Archive in Tampere, Finland, on December 3-4, 2019.

Picture: Christina Kurka (DIW Berlin)

NADDI 2019 Conference

Early Registration is still available for the 2019 North American Data Documentation Initiative Conference (NADDI) being held at Statistics Canada on April 25-26, 2019. Workshops are being held on April 24th.

http://naddiconf.org/2019/

The theme for NADDI 2019, ‘Benefits of Describing Statistical Production and Variables’, emphasizes the benefits of using metadata to drive efficiencies in a research data lifecycle and to promote subsequent re-use of end data products, especially those generated by federal and national statistical agencies.

EDDI 2019 Call for Papers

The call for papers is out.  Deadline...

DDI Scientific Board Election Results

The DDI Alliance recently held elections to fill the Vice-Chair position.  Ingo Barkow was elected Vice-Chair.

UNECE Supporting Standards Group

The DDI Alliance has participated in recent UNECE groups on supporting metadata standards.

Mari Kleemola participated in the Workshop on the Modernisation of Official Statistics, which was held in Geneva, Switzerland, on 27-28 November 2018. The workshop was attended by representatives from 29 countries and organisations. All documents and presentations are available from the Workshop wiki (https://statswiki.unece.org/x/HAFNDQ).

Jay Greenfield has been participating in the Supporting Standards Group.

Moving Forward: Berlin Report

Insert sprint final report.  EDDI 2018 Sprint Confluence page with the final report. 

Moving Forward: Ottawa

DDI 4 Core Sprint in the margins of the North American Data Documentation Initiative Conference, from 22 to 24 April 2019.

The Sprint was a meeting of the Modeling, Representation, Testing (MRT) working group. This group has been formed as part of the DDI Moving Forward project, and largely replaces the Modeling Team, but has a focus which includes not only modeling but also testing of the syntax representations and other work products.

In general terms, the Sprint was very successful – all of the anticipated deliverables were completed, and some topics for future work were discussed or explored, such that future progress will be more easily realized. In terms of the overall workplan, target milestones have been met, and in some cases exceeded. Significant work remains, but with the progress made at the Sprint, delivery of the DDI 4 Core at the end of December 2019, as a product ready for review and publication, is still a realistic goal.


The DDI 4 Core was identified by MRT as a subset of the DDI 4 released in the Prototype Review package (see the DDI 4 Core Summary and Overview document). It is intended to be a production release of some of the most useful functionality supported by that model and associated products, narrowed in scope to make resource issues more tractable. Emphasis is on the foundational metadata, data description, and some applications of the process model.

The MRT Working Group has adopted a working process somewhat different from earlier DDI 4 projects: a more limited scope has been identified, and short-term timelines established. The core features of the existing DDI 4 model are to be finalized and the entire standards product (the model, documentation, and syntax representations/bindings) is to be ready for review as a production release by the end of 2019. The working process is an iterative one, more fully embracing the Agile methodology which has to a limited extent informed all of the DDI 4 work up to this point.

Central to the work is the existence of a production system which will allow modeling to become part of a cycle which also includes the production of documentation and bindings. This system did not exist in a useful form at the start of the work, and prior to this Sprint, half of the group’s efforts have been focused on developing this critical infrastructure from the existing one (the TC production framework and the Lion Repository). The initial move away from the previous infrastructure was achieved at this Sprint, which is an important milestone in the overall working of the group, even if one which is not as visible in terms of the eventual standards product to be delivered.

One change from earlier production processes is the use of Canonical XMI as a format for describing the model. This format was agreed in discussions with the TC as offering several benefits. It serves as an exchange format for the UML model between the MRT and the TC, being designed as a portable format for such models. Further, it can also be used directly as a deliverable by users across a wide range of UML tools, a feature which is of increasing importance as implementers use DDI in new ways (e.g., not as XML or RDF, but as a model for analysis packages, repositories, and other systems).

In terms of the model content, the existing scope has been narrowed, but the substantial work of the past years forms the basis for the group’s current efforts. It is in essence a finalization and productization of the model and derived products, informed by the recent implementation and review of the DDI 4 Prototype. This input has indicated that changes are needed in both the style and content of the model and related products. Further, the work will need to be passed on to the TC at the point where it is ready for public review and distribution – the TC is ultimately the part of the DDI Alliance which will maintain it. Thus, alignment and integration with the TC production and management systems has been given a high priority in the work of the MRT.  

DDI 4 Core Overview and Scope

The work on producing the DDI 4 Core was launched so that, following the Prototype review, some of the core features of the DDI 4 work could be made ready for production release, recognizing that with available resources a narrower scope was desirable. Emphasis was placed on short-term delivery: the Modeling, Representation, Testing (MRT) working group has allowed itself a year to complete the work on the initial core release, with a final deliverable ready for review and release at the end of December 2019.

DDI has always faced the requirement of dealing with a large range of data, both for archival purposes and to provide support for the entire production lifecycle to large studies and statistical agencies. The result of this work is a model which in many important respects is domain-independent. Recent developments in the research world are placing a greater emphasis on cross-domain integration of data, and data coming from new sources, some of it in unfamiliar forms (e.g., “big data,” social media, sensor data, etc.). Social Science is no different from other domains – the DDI community is faced with a requirement for a more flexible ability to describe and manage data, now available in a wider variety of forms.

The DDI 4 Core is intended to provide useful functionality in response to this requirement. In presentations, review comments, and discussions it has become apparent that some aspects of the DDI 4 model included in the DDI 4 Prototype are of especially high value. Identified features include the conceptual aspects of variables and classifications, the datum-oriented description of data, and the use of the process model to describe data lineage (the processing involved in the provenance of data). These same features have been identified as of interest by participants at recent Dagstuhl workshops on the subject of cross-domain data integration, with a further emphasis on alignment with external standards and the use of a UML model as a primary deliverable.

The DDI 4 Core will include not only the XML and RDF syntax representations, but will also deliver the UML from which they are derived in the form of Canonical XMI, a portable, tool-independent expression of the model. This not only makes DDI available for representation in other syntaxes and systems, but provides a stable basis for the maintenance of the model into the future.

User-oriented subsets of the DDI 4 model are provided by the inclusion of Functional Views, organized to support the application of DDI to specific tasks. This approach was employed in the Prototype, and will be carried forward into DDI 4 Core in a refined form, having both a technical and documentary expression. In addition, high-level documentation aimed at introducing the model to adopters has been added. Together, these should make the DDI 4 Core more adoptable and easier to approach.

Because of the use of DDI 4 Core for cross-domain integration, and for other purposes, some key external standards have been selected as candidates for specific alignment (PROV-O for provenance, GSBPM for process description, and DCAT for data discovery). Documentation of alignment with this small set of selected standards will be part of the deliverable package. The use of existing RDF vocabularies in the RDF syntax representation of the model is anticipated, as a needed feature of alignment with standards/best practice in the Linked Data domain.

The idea that DDI 4 Core be re-branded to reflect its intended use has been discussed: DDI-Codebook and DDI-Lifecycle have brands which reflect their intended use, while DDI 4 Core does not. A re-branding would communicate to users the purpose of the new release, and minimize confusion as to which version of the standard is best suited for their applications. Currently, the use of the version number indicates an erroneous relationship between versions 3 and 4 which is causing some confusion among the potential users of the new standard release, as it did when earlier releases were referred to as “DDI 2” and “DDI 3”. (Suggestions have been along the lines of "DDI - Integration," "DDI - Cross-Domain," etc. Identifying a better name for the DDI 4 Core will need further exploration and conversation with the Marketing group and others.)

New DDI-compliant tool: web-based Questionnaire Design and Development Tool (QDDT)

With funding from the EU, QDDT replaces the current paper-based Word template used to document the European Social Survey's questionnaire design process. More info: https://github.com/DASISH/qddt-client/wiki

Sponsorships

ESS, ESRA, IASSIST