(1) Modular approach: The goal is that specific modules can be used flexibly: standalone, together with other DDI-CDI modules, or together with other specifications. The work will focus on identifying functional packages, defining the function of each package, establishing clear one-way dependencies between packages, and separating functional (core) packages/classes from supporting packages/classes.
Deliverable(s): detailed plan covering the subitems listed above, description of related rules (as modeling guidelines)
Activities:
Identify target functions for modules and areas of functional overlap
Propose a set of packages to support functions as identified
Design a mechanism/pattern to be implemented for cross-package dependencies
Produce a draft model/example which implements the package structure and dependency mechanism (e.g., DCAT/Schema.org, PROV-O)
Document the approach
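As an illustration of what "clear one-way dependencies between packages" could mean in practice, the following is a minimal sketch that checks a declared package dependency graph for cycles and derives an order in which packages can be used. The package names are hypothetical placeholders, not a proposed DDI-CDI package structure, and the check is only one possible way to enforce such a rule.

```python
from graphlib import TopologicalSorter, CycleError

# Hypothetical package names, for illustration only (not taken from DDI-CDI).
# Each package maps to the set of packages it depends on.
DEPENDENCIES = {
    "Core": set(),
    "DataDescription": {"Core"},
    "Process": {"Core"},
    "Supporting": {"Core", "DataDescription"},
}


def load_order(dependencies):
    """Return the packages ordered so that each one follows its dependencies.

    Raises ValueError if the declared dependencies contain a cycle,
    i.e. if they are not strictly one-way.
    """
    try:
        return list(TopologicalSorter(dependencies).static_order())
    except CycleError as exc:
        # exc.args[1] holds the list of nodes that form the cycle.
        raise ValueError(f"circular package dependency: {exc.args[1]}") from exc


if __name__ == "__main__":
    print(load_order(DEPENDENCIES))
```

A check of this kind could run whenever the model is changed, so that a new dependency from a core package to a supporting package is rejected early.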
(2) Data structure components (toolkit): Review an approach for building new data structure types, in addition to the existing ones (traditional wide/rectangular data, long [event] data, multi-dimensional data, and NoSQL/key-value data). Possible additional data structure types include graphs, text, and any object in a “cell” (tables, text, binary objects, arrays of arrays, etc.).
Deliverable(s): related description and guide, formal description of additional data structure types
Activities: Describing Graphs
Identify the functional requirements of a structural DDI-CDI description of data expressed as a graph (what will the structural description be used for?)
Work with example(s) to propose extensions to the existing DDI-CDI data structure descriptions (NGSI-LD, CSV on the Web WG approaches, possibly DataCube)
Document proposal and examples
Activities: Describing Nested Arrays
Identify the functional requirements of a structural DDI-CDI description of data expressed as a set of nested arrays (what will the structural description be used for?). These datasets are often very large, so the intended scope of support needs to be determined: subsets extracted using queries/services, or entire databases?
Activities: Documenting the “Toolkit”
Identify which features of the DDI-CDI model are used in describing new/hybrid data sets
Propose a methodology for combining these, based on some examples (taken from above but also including “hybrid” long/wide data, for example)
Document the methodology in a form which could be included in the specification in future
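To make the difference between wide, long, and "hybrid" long/wide layouts concrete, here is a small sketch that reshapes wide records into a long form while optionally leaving some columns wide. The function name, field names, and the "variable"/"value" column names are assumptions made for this example and do not come from the DDI-CDI model.

```python
def reshape(rows, keep, melt):
    """Turn wide rows into long rows.

    keep: fields copied unchanged onto every output row (identifiers, and
          any measures that should stay wide, which yields a hybrid layout).
    melt: fields converted into one (variable, value) row each.
    """
    long_rows = []
    for row in rows:
        base = {name: row[name] for name in keep}
        for name in melt:
            long_rows.append({**base, "variable": name, "value": row[name]})
    return long_rows


if __name__ == "__main__":
    # Hypothetical wide data: one row per unit, one column per variable.
    wide = [{"id": 1, "age": 30, "income": 100, "region": "N"}]
    # Fully long: only the identifier is kept.
    print(reshape(wide, keep=["id"], melt=["age", "income", "region"]))
    # Hybrid: "region" stays a wide column alongside the long part.
    print(reshape(wide, keep=["id", "region"], melt=["age", "income"]))
```

The same records appear in each layout; what changes is which structural role each column plays, which is the kind of information a structural description would need to capture.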
(3) UML class model interoperable subset (UCMIS): The strict use of UCMIS enables a robust model which can be imported into many UML tools and expressed in object-oriented syntax representations. The focus here will be the relationship to other specifications (in the light of the modular approach) at the model level and at the syntax representation level. See the documentation and spreadsheet for what was previously named the “Practitioner's Subset for Data Modeling”.
Deliverable(s): detailed description of using the UML approach with Abstraction stereotypes for relating to classes of other specifications, description of how this is realized at the level of syntax representations
Activities:
Review the existing draft and evaluate suitability for publication as a stand-alone work product
Create examples of how Abstraction stereotypes can be used to relate to other classes in external specifications, and document the approach
Create examples of how UCMIS binding into specific syntax representations can be expressed, and document the approach
(4) Syntax representations of the model: Exploration and decisions on OWL/RDF-S, JSON-LD, and ShEx (as a constraint language for RDF). The work will build on an existing mapping from UML to OWL/RDF-S.
Deliverable(s): documentation on decisions and mapping
Activities:
Identify requirements for syntax expression in RDF and related technologies, based on existing examples (Helmholtz, NGSI-LD (INTERSTAT, GitHub), etc.)
Consider the existing draft and evaluate in terms of the identified requirements
Modify mapping to reflect the findings
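As a rough illustration of what a UML-to-OWL/RDF-S mapping has to decide, the sketch below renders a single UML-style class with attributes as Turtle. It is not the existing mapping referred to above: the rule used here (class to owl:Class, generalization to rdfs:subClassOf, attribute to owl:DatatypeProperty named Class_attribute) is an assumption for this example, as are the namespace IRI and the class and attribute names.

```python
# Placeholder prefix declarations; the "ex" namespace IRI is hypothetical.
PREFIXES = """@prefix ex: <http://example.org/model#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
"""


def class_to_turtle(name, attributes, parent=None):
    """Render one class as Turtle under the assumed mapping rule.

    attributes maps an attribute name to an XSD datatype local name,
    for example {"label": "string"}.
    """
    head = f"ex:{name} a owl:Class"
    if parent is not None:
        head += f" ;\n    rdfs:subClassOf ex:{parent}"
    blocks = [head + " ."]
    for attr, datatype in attributes.items():
        blocks.append(
            f"ex:{name}_{attr} a owl:DatatypeProperty ;\n"
            f"    rdfs:domain ex:{name} ;\n"
            f"    rdfs:range xsd:{datatype} ."
        )
    return "\n\n".join(blocks)


if __name__ == "__main__":
    print(PREFIXES)
    # Hypothetical class and attribute names.
    print(class_to_turtle("ExampleClass", {"label": "string"}, parent="ExampleParent"))
```

Even this small case raises questions of the kind the requirements analysis would need to settle, such as how properties are named and whether domains and ranges are stated in OWL/RDF-S or left to a constraint language such as ShEx.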
(5) Implementation guides: Identify the methodology by which a community of users will specify how they will employ the model in their own implementations, such that they become more easily interoperable. The intersection with other machine-processable descriptions of data-sharing resources and methods within the community will be a focus.
Deliverable(s): a designed and documented methodology for defining community implementation guides, together with whatever tools/templates/examples might be useful.
Activities:
Identify the required functionality for Implementation Guides (IGs) (e.g., specifying a subset of the model, indicating syntax expressions)
Develop practical approaches to the creation of IGs – how the analysis within a community can be conducted and documented
Draft documentation and templates for applying and publishing IGs
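One way to picture a machine-processable implementation guide is as a declaration of the model subset and the syntax expressions a community has agreed on, against which a description can be checked. The sketch below assumes exactly those two features (both mentioned in the activities above); the class name ImplementationGuide, its fields, and the sample values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImplementationGuide:
    """Hypothetical, minimal representation of a community guide."""
    community: str
    allowed_classes: frozenset  # subset of model classes the community uses
    syntaxes: frozenset         # syntax expressions the community accepts

    def violations(self, used_classes, syntax):
        """List the ways a description departs from this guide."""
        outside = sorted(set(used_classes) - self.allowed_classes)
        problems = [f"class not in subset: {name}" for name in outside]
        if syntax not in self.syntaxes:
            problems.append(f"syntax not supported: {syntax}")
        return problems


if __name__ == "__main__":
    # Placeholder class names; a real guide would list actual model classes.
    guide = ImplementationGuide(
        community="example-community",
        allowed_classes=frozenset({"ClassA", "ClassB"}),
        syntaxes=frozenset({"JSON-LD"}),
    )
    print(guide.violations(["ClassA", "ClassC"], "XML"))
```

Two communities whose guides share classes and a syntax could use the overlap of these declarations as a first estimate of how easily they can exchange descriptions.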
(6) I-ADOPT and DDI-CDI: The I-ADOPT specification provides a model of how data can be described with clusters of variables and captures information about the data which is not expressed in DDI-CDI. This activity examines the intersection of the two specifications and aims to determine how they can best be used together to solve real-world problems in cross-domain data sharing.
Deliverable(s): a mapping of DDI-CDI and I-ADOPT, with a documented example or examples showing how the two specifications can be combined to support cross-domain data-sharing requirements.
Activities:
Identify the use cases for which the combined use of DDI-CDI and I-ADOPT is appropriate
Develop required mapping between DDI-CDI and the I-ADOPT metamodel
Apply the model to example use cases
Document the mapping and examples