Standards are a vital tool enabling integration and semantic linking of data within and between disciplines. However, standards tend to be developed and adopted within disciplines or application domains with little consideration of cross-discipline requirements and technologies, so data integration is often easy only within and between closely allied fields. Addressing global scientific challenges that depend on cross-discipline integration remains difficult. The challenge is to make cross-discipline data integration a routine aspect of data-driven science.
Metadata support data discovery, selection, access and use, and are critical for data integration. Data from different sources and domains should be described so that cross-discipline discovery can detect and access the relevant data collections, and so that transformations and analyses can be automated. The use of cross-discipline data should become efficient, scalable and reproducible, enabling discipline-neutral data processing and analysis tools to be applied. It would also become possible to apply (meta-)data mining approaches and automated reasoning. In sum, new opportunities for insight and understanding will emerge.
A CODATA initiative on interdisciplinary data integration is seeking to explore these challenges and opportunities in relation to three specific case studies in interdisciplinary research into infectious disease outbreaks, disaster risk and resilient cities. These case studies provide a concrete focus for exploring the potential of interoperability and data integration through metadata alignment.
The workshop will build in particular on the following activities: (i) two previous workshops on DDI and interoperability with other specifications (1, 2), (ii) work to extend and refine DCAT by the W3C Dataset Exchange Working Group (DXWG), and (iii) the three detailed case studies and pilots from the CODATA initiative mentioned above. Metadata activities in the Research Data Alliance provide additional background and context.
There are several different areas where metadata come into play:
Expressing discoverable, structured metadata must be automatic and, as far as possible, achieved using tools that are familiar and in common use.
Areas of exploration and discussion will identify and describe the following:
The output of the workshop will likely be reports and working documents on one or more of these topics.
The core objective of the workshop will be to investigate and advance alignment between cross-disciplinary and domain-specific metadata standards, and to bridge from standards focusing on collection-level metadata to those focusing on variable-level metadata.
Metadata standards that may be considered include (see the detailed list below):
Data transformations to prepare data for analysis may be described in machine-actionable form. DDI 4 uses some patterns of BPMN to achieve this, and CSV on the Web addresses transformation of tabular data into semantic form.
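To make the idea of transforming tabular data into semantic form concrete, the following minimal Python sketch turns CSV rows into N-Triples-style statements, driven by a small metadata description. The metadata keys (`aboutUrl`, `tableSchema`, `columns`, `propertyUrl`, `datatype`, `suppressOutput`) follow the naming used by CSV on the Web, but the code is an illustration only and not a conformant CSVW processor. The `example.org` URLs, the sample data and the function name `rows_to_triples` are assumptions made for this example.

```python
import csv
import io

XSD = "http://www.w3.org/2001/XMLSchema#"

# Hypothetical, simplified metadata in the spirit of CSV on the Web.
# All example.org URLs are placeholders, not real vocabularies.
METADATA = {
    "aboutUrl": "http://example.org/obs/{id}",
    "tableSchema": {
        "columns": [
            {"name": "id", "suppressOutput": True},
            {"name": "region",
             "propertyUrl": "http://example.org/def/region"},
            {"name": "cases",
             "propertyUrl": "http://example.org/def/cases",
             "datatype": "integer"},
        ]
    },
}

# Made-up sample data, for illustration only.
SAMPLE = "id,region,cases\n1,North,12\n2,South,7\n"


def rows_to_triples(csv_text, metadata):
    """Return one N-Triples-style string per described cell.

    Simplifications: only {column} placeholders are expanded in
    aboutUrl, and literal values are not escaped.
    """
    columns = metadata["tableSchema"]["columns"]
    triples = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        # Build the subject URL by filling in the row's values.
        subject = metadata["aboutUrl"]
        for key, value in row.items():
            subject = subject.replace("{" + key + "}", value)
        for col in columns:
            if col.get("suppressOutput"):
                continue  # column is used for identification only
            literal = '"' + row[col["name"]] + '"'
            if "datatype" in col:
                literal += "^^<" + XSD + col["datatype"] + ">"
            triples.append(
                "<%s> <%s> %s ." % (subject, col["propertyUrl"], literal)
            )
    return triples


if __name__ == "__main__":
    for triple in rows_to_triples(SAMPLE, METADATA):
        print(triple)
```

Because the column-to-property mapping lives in the metadata rather than in the code, the same function could in principle be applied to tables from different domains by swapping the metadata description.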
Additional relevant standards are likely to be uncovered during the development of the CODATA initiative.