2024 Evaluating and Refining Cross-Domain Metadata Exchange Frameworks
Topics
Data Access
This group will produce examples of how access can be described using ODRL, and will examine the kinds of controlled vocabularies needed to support interoperability among communities of users and in cross-domain scenarios. Existing vocabularies (such as the Data Privacy Vocabulary) should be considered, gaps identified, and any new vocabularies needed to fill those gaps proposed.
Reference (from Herve/Darren): https://zenodo.org/records/12761478
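As a starting point for discussion, the sketch below shows one way an access condition could be expressed as an ODRL policy in JSON-LD, with the purpose drawn from the Data Privacy Vocabulary. It is an illustration only, not a CDIF recommendation: the dataset and policy IRIs are hypothetical, and the choice of dpv:ResearchAndDevelopment as the purpose is an assumption made for the example.

```python
import json

# Hypothetical identifiers, used only for illustration.
POLICY_IRI = "https://example.org/policy/1"
DATASET_IRI = "https://example.org/dataset/1"
# Assumed purpose concept from the Data Privacy Vocabulary (DPV).
PURPOSE_IRI = "https://w3id.org/dpv#ResearchAndDevelopment"


def build_access_policy(policy_iri, dataset_iri, purpose_iri):
    """Return an ODRL 'Set' policy that permits use of a dataset for one purpose."""
    return {
        "@context": "http://www.w3.org/ns/odrl.jsonld",
        "@type": "Set",
        "uid": policy_iri,
        "permission": [
            {
                "target": dataset_iri,
                "action": "use",
                # The permission applies only when the purpose matches.
                "constraint": [
                    {
                        "leftOperand": "purpose",
                        "operator": "eq",
                        "rightOperand": {"@id": purpose_iri},
                    }
                ],
            }
        ],
    }


policy = build_access_policy(POLICY_IRI, DATASET_IRI, PURPOSE_IRI)
policy_json = json.dumps(policy, indent=2)
print(policy_json)
```

Keeping the purpose as an IRI from a shared vocabulary, rather than free text, is what would let a consumer in another domain evaluate the constraint automatically.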
Discovery
This topic will focus on how the existing Discovery Core profile of CDIF can be used in combination with other profiles, including Data Description and Access, and potential overlap with Provenance information. The mapping between PROV and Schema.org should be considered. Outputs should include documented examples.
Semantic and Syntactic Mapping
The expression of semantic mappings and their use in performing transformations of data for integration purposes is an important topic for the use of FAIR data. This group will explore the way in which mappings can be described for reuse, with an emphasis on machine-actionability. The goal here is to make recommendations aligned with those coming from the RDA group on FAIR mappings and other related work, and to align with other recommendations around data description and integration in CDIF.
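To make "machine-actionable" concrete, the following minimal sketch treats each mapping as a small record (subject, predicate, object) and applies only the mappings whose predicate is accepted. The field names and the sample record are hypothetical, and using SKOS mapping predicates as the relation type is an assumption for illustration, not a format endorsed by the workshop.

```python
# Hypothetical mapping records: a source field name, a mapping relation,
# and a target field name.
MAPPINGS = [
    {"subject": "geo", "predicate": "skos:exactMatch", "object": "ref_area"},
    {"subject": "yr", "predicate": "skos:exactMatch", "object": "time_period"},
    {"subject": "val", "predicate": "skos:closeMatch", "object": "obs_value"},
]


def apply_mappings(record, mappings, accepted=("skos:exactMatch",)):
    """Rename record fields using mappings whose predicate is accepted.

    Returns the transformed record and the list of source fields that were
    left out because no accepted mapping covers them.
    """
    lookup = {
        m["subject"]: m["object"]
        for m in mappings
        if m["predicate"] in accepted
    }
    transformed = {}
    unmapped = []
    for field, value in record.items():
        if field in lookup:
            transformed[lookup[field]] = value
        else:
            unmapped.append(field)
    return transformed, unmapped


source_record = {"geo": "KE", "yr": 2020, "val": 3.1}
transformed, unmapped = apply_mappings(source_record, MAPPINGS)
print(transformed)
print(unmapped)
```

Recording the relation type alongside each mapping lets a consumer decide how strict to be: here only exact matches are applied automatically, and the close match is reported for review rather than silently used.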
Data Description
The basic core profile presented in the current CDIF draft is limited in scope. This group will explore how the metadata profile can be extended to better support the description of data for integration. There is a strong overlap here with mapping, discovery, and provenance metadata, and these connections should be explored through worked examples (based on the SDG Indicators, etc.).
Provenance: Context
To understand the context of data across domain boundaries, a number of contextual factors require description. A shared understanding of how to describe observed events, the collection or generation of data, samples, processing, and archival and dissemination practices is needed to fully describe data for the purposes of reuse. This group will build on work done at the 2023 Dagstuhl workshop to draft a common model for these aspects of provenance and the context of data. (See discussion draft of provenance framework: https://docs.google.com/document/d/1WLkXrcVmd_yNTWcs_OZYxAMqBssWoM6vEtb4MOuP8zA/edit?usp=drive_link )
Provenance: Reproducibility and Process Description
The description of provenance is critical for supporting the reproducibility of findings. This group will consider and document the various approaches being advocated in this area (containerization, description of experiments, processing descriptions, etc.). The goal of this work is to draft recommendations for how this aspect of provenance should be addressed in CDIF. A focus of this activity will be alignment with the work in the Context group.
Workshop Summary (TBD after workshop)
Date and Location
The workshop takes place at Schloss Dagstuhl – Leibniz Center for Informatics from October 13 to October 18, 2024. See also the corresponding Dagstuhl web page and its information on COVID-19.
See the separate page for practical information.
Workshop Schedule
See the separate page for practical information.
Organizers and Participants
Organizers
Michelle Edwards, University of Guelph - Canada
Arofan Gregory, CODATA and DDI Alliance - USA
Simon Hodson, CODATA - Committee on Data of the International Science Council (ISC) - France
Steven McEachern, UK Data Service, University of Essex and DDI Alliance
Hilde Orten, Sikt – Norwegian Agency for Shared Services in Education and Research and DDI Alliance
Joachim Wackerow, Independent Expert - Germany
Participants
Darren Bell, UK Data Archive
Tathagata Bhattacharjee, LSHTM (The London School of Hygiene & Tropical Medicine)
Ian Bruno, CCDC (Cambridge Crystallographic Data Centre)
Simon Cox, OGC (Open Geospatial Consortium)
Mark Dietrich, EGI (European Grid Infrastructure)
Doug Fils, Independent expert
Heike Gorzig, HZB (Helmholtz-Zentrum Berlin)
Alexandra Kokkinaki, NOC (UK National Oceanography Centre)
Polina Koroleva, UNEP (UN Environment Programme)
Yann Le Franc, eScience Factory
Kerstin Lehnert, Columbia University
Iseult Lynch, University of Birmingham
Lauren Maxwell, University of Heidelberg
Luis Gonzalez Morales, UN Statistics Division
Michael Ochola, APHRC (African Population and Health Research Center)
Dennis Richard, University of Copenhagen
Steve Richard, Independent expert
Flavio Rizzolo, Statistics Canada