DDI-CDI and Other Standards (2025)

Other Standards/Mapping Deliverables Summary

Here is a quick summary of the deliverables. We still need to complete the first section (Luis, I was not sure what to write), and we may want to indicate where drafts of any relevant documents can be found.

https://docs.google.com/document/d/1ZpMkjuoHgrg3W7iI-8gdJFdvKmG_lNC9G67WPkDO44o/edit?tab=t.0

Topic overview

This topic is broad, but it builds on several ongoing efforts, both from earlier Dagstuhl workshops and in the broader user community. Specific outputs should build on this progress as appropriate. It is unlikely that all of the following can be addressed as specific deliverables, but some subset should be addressed during the workshop, and all should be considered relevant inputs:

  • CDIF: This involves the combined use of http://Schema.org, SKOS, DCAT, ODRL, PROV, and I-ADOPT, expressed as JSON-LD. This is not a focus for this workshop, but it should be considered whenever any of these standards is combined with DDI-CDI.

  • SDMX-DDI: There is an ongoing effort in the Modern Stats community to align SDMX and DDI-CDI. This topic has also come up in various implementations of DDI-CDI. Contributions to this work should be considered as outputs of this workshop, although coordination with the Modern Stats group will need to be considered.

  • DDI-to-DDI: Having published mappings between existing DDI metadata standards is important. While some such mappings exist in various tools (the ones from Pascal Heus, Nectar, etc.), they have not been agreed on or published as part of the documentation. This topic will include mappings from DDI-CDI to and from DDI Codebook and DDI Lifecycle, with consideration of which syntax representations are covered.

  • Croissant ML: There has been some discussion of how good metadata impacts the use of generative AI in data-related topics. Central to many of these discussions is the Croissant ML standard. It should be possible to produce good Croissant documentation for the training of LLMs and other similar applications on the basis of DDI-documented data, and especially from DDI-CDI. Exploring these themes and documenting any findings would be the work of this group.

Shared Google Folder (please use a Google Doc to take notes on your discussion)