Implementation Languages Workshop 2022

Date               | Time       | Meeting Room           | Purpose
Friday 2 December  | 9:00-17:00 | Room K319, 1 St Thomas | Requirements Gathering / Pre-discussion
Monday 5 December  | 9:00-17:00 | Room K319, 1 St Thomas | Workshop
Tuesday 6 December | 9:00-17:00 | Room CS16, 1 St Thomas | Workshop

Venue: https://goo.gl/maps/MMoNRbpQnLC6xFJx7

 

Aim

Identification and use of implementation languages in the DDI Suite of products:

  • Identify priority implementation languages for DDI products (e.g. RDF, JSON, UML, XML, etc.)

  • Identify style options for implementation languages 

  • Mappings to produce syntax representations 

    • Moving from conceptual models to serialization

  • What aspects of implementation should be consistent

    • Document options, decisions, and reasoning

  • Provide guidance for variation from the agreed model

    • based on applied use of product

    • what needs to be noted and how (need a consistent expression of exceptions and reasons)

Outputs:

  • Documentation of implementation language decisions

  • Guidelines for implementing languages in various products

  • Plan for providing and testing multiple implementation languages for current products

Background: 

The DDI Suite currently expresses its individual products in a number of different implementation languages.

Each product uses one or two of these languages, and many are interested in expanding to multiple expressions. Current usage includes XML Schema, UML XMI, RDF, and JSON.

The Technical Committee's work to move DDI Lifecycle to the COGS production tool will allow us to store content in a CSV format and export to multiple languages. DDI-CDI is also working on adding implementation languages, beginning with RDF.
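The idea of holding content once and exporting it to several languages can be sketched as follows. This is only an illustration: the three-column CSV layout, the `Variable` item and its property names are hypothetical and are not the actual COGS file format or DDI content. It shows one small model being rendered both as a JSON Schema fragment and as an XML Schema fragment.

```python
import csv
import io
import json

# Hypothetical CSV layout (item, property, datatype); this is NOT the actual
# COGS file format, only a stand-in for "content held in CSV".
MODEL_CSV = """item,property,datatype
Variable,name,string
Variable,label,string
Variable,decimalPositions,integer
"""


def read_model(csv_text):
    """Group property rows by the item they belong to."""
    model = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        model.setdefault(row["item"], []).append((row["property"], row["datatype"]))
    return model


def to_json_schema(model):
    """Render each item as a JSON Schema object definition."""
    return {
        item: {
            "type": "object",
            "properties": {name: {"type": datatype} for name, datatype in props},
        }
        for item, props in model.items()
    }


def to_xsd(model):
    """Render each item as an XML Schema complexType (namespace declarations omitted)."""
    lines = []
    for item, props in model.items():
        lines.append(f'<xs:complexType name="{item}">')
        lines.append("  <xs:sequence>")
        for name, datatype in props:
            lines.append(f'    <xs:element name="{name}" type="xs:{datatype}"/>')
        lines.append("  </xs:sequence>")
        lines.append("</xs:complexType>")
    return "\n".join(lines)


model = read_model(MODEL_CSV)
json_schema = to_json_schema(model)
xsd = to_xsd(model)
print(json.dumps(json_schema, indent=2))
print(xsd)
```

Both outputs come from the same rows, so any style decision (naming, datatypes, ordering of properties) is made once per target language in its export function rather than once per product.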

Rather than work independently, the Technical Committee and the DDI-CDI Working Group believe that it would be beneficial to explore options and provide guidelines to the DDI maintenance and development groups on the use of various features of these languages.

We need to determine if and where we need uniformity and how to inform users of differences in the implementation of individual products in different languages.

This work addresses the approved Scientific Board Work Plan Goals noted in the appendix.

Pre-discussion / Agenda setting (Friday)

Scope of products

DDI-Codebook, DDI-Lifecycle and DDI-CDI

Priority implementation languages

Over several previous discussions there has been consensus that, in addition to an XML presentation, we should have RDF/OWL, JSON and a UML model. Other suggestions have also been made. The TC proposal for DDI-L is here: https://docs.google.com/document/d/11Is5WOxoRfFDDZTKncOh46zg-WDzr65Xoj_HHLomiqo/edit?pli=1#heading=h.b0fascer6bv9 A useful output from this discussion would be a grid giving, for each product, a proposal of Must, Desirable or Future for each implementation.

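Implementation        | DDI-C | DDI-L | DDI-CDI
XML Instance/Fragment | M     | M     | M
JSON / OpenAPI        | D     | M     | M
JSON-LD               |       |       | M
RDF/OWL               | D     | M     | M
UML Normative / XMI   | D     | M     | M
Python                |       | F     |
PHP                   |       | F     |
C#                    |       | D     |
Java                  |       | F     |
R                     |       | F     |

Key: M = Must, D = Desirable, F = Future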

DDI-C backwards compatibility issue - decisions needed to allow other implementations to happen.

OO -

Alignment of implementations across products

Google doc of notes

Common Objects Summary: https://ddi-alliance.atlassian.net/l/cp/DxeYnTpi

  1. Class definitions of major objects

    1. Text overlap with Glossary Group

    2. Relationship to models e.g. ISO 11179, GSIM, Neuchatel

    3. Relationship to implementation standards DC, Schema.org

    4. See slides from Flavio:

    5. Identify the ‘major objects’

    6. Work on two examples as POC / template

      1. Concept and its uses

      2. Variable and its uses

      3. Classifications & code lists

      4. Physical & logical data structures

  2. Potential rationalisation of namespaces of ‘objects’

  3. Common representations of comparable patterns, groups of objects and structures, examples being:

    1. Identification / Definable type / Signification pattern / Collections / Group generic structure / Variable Cascade / Classifications / Value domains

    2. Other use cases

  4. Harmonization between products follows on from 1 & 2 above

    1. Does this make sense and, if so, in what way? At what level is there a common structure in terms of basics, and how do they transfer between products?

A useful outcome of this discussion would be an agreed scope and perhaps an indication of which areas should be a priority: low-hanging fruit and/or specific pain points.

What are the implications for different serialisations / implementation languages

Some areas previously raised include:

  • Nature of implementation languages - relaying features not expressed as explicit objects or specified relationships (links, references) - Model considerations (what is held in model and what is held elsewhere)

  • How these vary between implementation languages, and a consistent means of expressing them in the various implementation languages

    • Relationships expressed in XML through hierarchy need to be expressed explicitly in languages like RDF

    • e.g.

      • ordering in RDF

      • constraints in SHACL

  • Understand similarities and differences / are there blockers / unnecessary barriers

    • DDI-L and DDI-CDI RDF
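The point above about XML hierarchy and ordering can be illustrated with a small sketch. The XML element names, the identifiers and the `ex:` predicates are all hypothetical and are not taken from any DDI schema or vocabulary; an intermediate "member" node carrying a position is only one of several ways ordering could be expressed in RDF. The sketch uses plain tuples instead of an RDF library so that it runs with the Python standard library alone.

```python
import xml.etree.ElementTree as ET

# Hypothetical XML, not taken from any DDI schema: the nesting says which
# questions belong to the sequence, and document order says how they are ordered.
XML_SOURCE = """
<Sequence id="seq1">
  <Question id="q1"/>
  <Question id="q2"/>
  <Question id="q3"/>
</Sequence>
"""


def hierarchy_to_triples(xml_text):
    """Turn implicit parent/child nesting and sibling order into explicit triples."""
    root = ET.fromstring(xml_text)
    triples = []
    for parent in root.iter():
        for position, child in enumerate(parent, start=1):
            member = f"{parent.get('id')}_member{position}"
            # The containment that XML expresses through nesting.
            triples.append((parent.get("id"), "ex:hasMember", member))
            triples.append((member, "ex:refersTo", child.get("id")))
            # The order that XML expresses through document position.
            triples.append((member, "ex:position", position))
    return triples


triples = hierarchy_to_triples(XML_SOURCE)
for triple in triples:
    print(triple)
```

Three lines of nested XML become nine statements, because containment, reference and position each have to be stated explicitly. Whether positions must be present, unique and contiguous is the kind of rule that would then be expressed as a constraint, for example in SHACL.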

Other items raised:

There should be a discussion topic covering the content manager's point of view as well as the content user's.

 

For discussion:

CDI URLs

Codebook ↔︎ CDI

Lifecycle ↔︎ CDI

CDI content → source metadata

Participants

Name                    | Organisation
Darren Bell             | UKDA
Pierre-Antoine Champin  | University of Lyon
Franck Cotton           | INSEE
Christophe Dzikowski    | INSEE
Arofan Gregory          | CODATA
Oliver Hopt             | GESIS
Jon Johnson             | CLOSER
Geneviève Michaud       | SciencesPo
Olof Olsson             | SND
Flavio Rizzolo (online) | StatsCan
Dan Smith               | Colectica
Romain Tailhurat        | INSEE
Wendy Thomas            | IPUMS
Tom Villette            | SciencesPo