Relevant standards guidance
Introduction
During the development of DDI 4, an international standard for the description of statistical and social science data, a need was identified for guidance on implementing DDI 4 with existing data and metadata standards. The 4th version of DDI is designed to complement existing data and metadata standards and should preclude the recreation of objects, elements, and other types of information from existing standards.
Purpose
This document contains guidelines for content modellers and others interested in referencing and integrating standards with DDI 4 to support metadata and data interoperability, and to add flexibility to implementations of DDI. The 4th version of DDI is model-based, allowing for interoperability at the conceptual level as well as at the element level.
Description
The guidance in this document consists of general information on the use of other standards with DDI, instructions on how to develop profiles, and several examples of how standards can be used with DDI. The guidance is based on ISO/IEC TR 10000-1:1998(E), Framework and Taxonomy of International Standardized Profiles.
DDI 4 and other standards
DDI 4 can be used with existing standards to support the research data lifecycle/statistical business process. See Appendix A for descriptions of standards that can be used with DDI.
When combining standards, it is recommended that modellers create "profiles", which are specifications for the use of one or more standards. A profile can contain one or more profiles; for example, a profile for data discovery can contain profiles for DDI + SDMX/Data Cube and DDI + Dublin Core. The result is a master profile that can be implemented in whole or in part.
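The nesting of profiles can be sketched in Python as follows. This is a minimal illustration and not part of DDI 4: the `Profile` class, its fields, and the `all_standards` method are hypothetical names introduced here.

```python
from dataclasses import dataclass, field


@dataclass
class Profile:
    """Hypothetical profile record: a name, the standards it combines, and nested profiles."""
    name: str
    standards: tuple = ()
    sub_profiles: list = field(default_factory=list)

    def all_standards(self) -> set:
        """Collect the standards used by this profile and by every nested profile."""
        found = set(self.standards)
        for sub in self.sub_profiles:
            found |= sub.all_standards()
        return found


# A master profile for data discovery that contains two narrower profiles.
cube = Profile("DDI + SDMX/Data Cube", standards=("DDI 4", "SDMX", "Data Cube"))
citation = Profile("DDI + Dublin Core", standards=("DDI 4", "Dublin Core"))
discovery = Profile("Data discovery", sub_profiles=[cube, citation])

print(sorted(discovery.all_standards()))
```

Because each nested profile is a complete `Profile` in its own right, an implementer could adopt `cube` or `citation` alone, which mirrors the idea of implementing the master profile in part.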
The ISO Technical Report refers to the International Standardized Profile (ISP), which is "[a]n internationally agreed-to, harmonized document which describes one or more profiles" (p. 1). This guidance will not deal with ISPs specifically.
Approaches to combining standards
The three approaches to using other standards* with DDI 4 are:
- Insert or integrate a complete standard with DDI 4.
- Use a complementary set of elements from other standards.
- Select specific elements from other standards.
*Including locally-defined elements.
Interoperability and DDI
Standards identified for possible use with DDI are assessed to determine the extent to which use of the standards supports interoperability.
The two levels of interoperability for DDI 4 are:
- Model-level (conceptual model, e.g. PROV)
- Element-level (unique elements or element structures)
Interoperability is measured at the conceptual level, not the label level, for both model-based and element-based standards (vocabularies): elements from different standards can bear the same name and yet be conceptually different. Interoperability for element-based standards is evaluated at the element level only.
Note: Evaluation of label-level interoperability is normally not advised, unless it is certain that the same vocabulary has been implemented in the standards being considered.
Interoperability first entails that there be intersections between standards, i.e. that they have objects that are connected in one or more relationships. Across the intersections, two standards are complementary when one lends specificity to the other. For example, the addition of certain Dublin Core elements to a DDI citation lends specificity to the DDI citation.
Here, at the element-level in DDI 3.2, a citation includes DDI elements and Dublin Core elements.
Another example: the substitution of certain GSIM objects with OWL-S objects turns a model that was conceptual and human-readable into a model that is actionable and machine-readable.
A second standard may be complementary to DDI when it fills a "gap". For example, SDMX adds missing elements to the DDI ncube; the ncube characterizes aggregate data such as an OLAP cube or a statistical dataset.
Two standards are said to be non-complementary if the parts of each described in the profile are overlapping. See "Duplication" and "False Friends" below.
Model and element mapping [EH: Needs input from modellers]
The critical part of evaluating interoperability between standards is conducting model and element mappings. Modellers can use their own tools and techniques for mappings, with the following principles:
Model mapping
- Go to the source model and evaluate objects, definitions, and relationships for conformance. See "Conformance to DDI" for more information.
- Ensure that objects with the same label or definition are not duplicated within a profile.
Element mapping
- Evaluate the elements in the standards to be used with DDI, in particular the element labels and definitions.
- Look for element groupings and splits, where an element in one standard is expressed as two elements in the other.
- Look for element hierarchy and nesting issues, as well as dependencies and constraints.
- Ensure that elements with the same label or definition are not duplicated within a profile.
- Beware of "False friends": elements that have the same or similar labels or definitions, but are conceptually different.
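A first pass over element labels can be automated before the conceptual review. The sketch below is one possible approach, not a prescribed tool: the function names, the two placeholder standards, and the sample elements are all invented for illustration. A textual comparison can only nominate candidates; deciding whether two elements are conceptually the same remains the modeller's task.

```python
from collections import defaultdict


def normalize(text: str) -> str:
    """Lower-case and collapse whitespace so trivially different strings compare equal."""
    return " ".join(text.lower().split())


def check_labels(elements):
    """Group (standard, label, definition) tuples by label and flag labels shared across standards.

    Returns two dicts keyed by normalized label:
    - duplicates: same label and same definition in more than one standard
    - review: same label but differing definitions (candidate "false friends")
    """
    by_label = defaultdict(list)
    for standard, label, definition in elements:
        by_label[normalize(label)].append((standard, normalize(definition)))

    duplicates, review = {}, {}
    for label, entries in by_label.items():
        standards = sorted({s for s, _ in entries})
        if len(standards) < 2:
            continue  # label occurs in one standard only
        definitions = {d for _, d in entries}
        target = duplicates if len(definitions) == 1 else review
        target[label] = standards
    return duplicates, review


# Invented sample data: two placeholder standards with overlapping labels.
sample = [
    ("Standard A", "Title", "A name given to the resource"),
    ("Standard B", "title", "A name given to the  resource"),
    ("Standard A", "Coverage", "Spatial or temporal extent of the resource"),
    ("Standard B", "Coverage", "Share of the population included in the sample"),
    ("Standard A", "Creator", "Entity primarily responsible for the resource"),
]
duplicates, review = check_labels(sample)
print("Duplicated:", duplicates)
print("Review as possible false friends:", review)
```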
If two or more standards are sufficiently interoperable, they can be defined in a profile. More information on the extent of interoperability will be discovered, and should be documented, during the development of a profile.
DDI Profiles
A profile is a specification for the use of a particular standard or group of standards to support interoperability for or within an application, function, community, or environment. DDI profiles are most likely to be created for an application, such as a statistical information system, or for a function, e.g. that of a data librarian.
A profile should also include at a minimum a DDI functional view, which is composed of DDI objects, and some part or all of an external standard, as well as locally-defined elements.
The scenarios of interoperation defined in a profile can take the form of one or more representations in which the interoperation is articulated. Representations can be included as appendices in a profile. A representation can take several forms including a UML model, an RDF specification, or an OWL specification. If the external standard has a UML model, the scenarios of interoperation are best described in a UML model. If the external standard has an RDF vocabulary, the scenarios of interoperation are best described in an RDF specification. Alternatively, OWL can be used to specify the scenarios of interoperation.
Profiles are also useful for identifying gaps between standards, which can be filled through subsequent versions of standards or new standards.
It is recommended to use existing profiles wherever possible, before creating new ones.
Developing a profile
The construction of the profile is guided by several principles:
- A profile restricts the choice of base standard options to the extent necessary to maximize the probability of interworking;
- However, it must not specify any requirements that would contradict or cause non-conformance to a base standard;
- But it may specify requirements that are more specific than those of the base standard.
Conformance to a profile, therefore, implies conformance to the base standards referenced in the profile. Conformance to the base standards, however, does not necessarily imply conformance to the profile.
When developing a profile, consider the following: (to be developed)
- A
- B
- C
- D
In the final analysis, conformance is to the combination of standards, as distinct from conformance to the base standards in isolation. This combination frequently takes a form distinct from the base standards and is often characterized by changes in relationships:
- Unconditional mandatory requirements in the base standard should remain mandatory in the profile.
- Unconditional options in base standards may remain optional or may be changed within the profile to become:
  - mandatory;
  - conditional, giving rise to different statuses dependent upon some appropriate condition;
  - out of scope, if the option is not relevant to the scope of the profile, for example functional elements which are unused in the context of the profile;
  - prohibited, if the use of the option is to be regarded as non-conformant behaviour within the context of the profile. This choice should only be used when really necessary; "out of scope" may often be more appropriate.
- If the conditions in the conditional requirements in the base standards can be fully evaluated in the context of the profile, then these requirements become unconditional mandatory requirements or unconditional options, or they become out of scope or prohibited. Otherwise the requirements remain conditional, with the appropriate, possibly partially evaluated, conditions.
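The rules above amount to a table of permitted status changes, which can be checked mechanically. The following Python sketch is one way to encode them; the status names, the `ALLOWED` table, and the requirement names in the example are this sketch's own and are not defined by DDI or by the ISO Technical Report. It checks only the resulting status and does not evaluate the conditions themselves.

```python
# Statuses a requirement can have in a base standard or in a profile.
MANDATORY, OPTIONAL, CONDITIONAL = "mandatory", "optional", "conditional"
OUT_OF_SCOPE, PROHIBITED = "out of scope", "prohibited"

# For each base-standard status, the statuses a profile may assign.
ALLOWED = {
    MANDATORY: {MANDATORY},
    OPTIONAL: {OPTIONAL, MANDATORY, CONDITIONAL, OUT_OF_SCOPE, PROHIBITED},
    CONDITIONAL: {CONDITIONAL, MANDATORY, OPTIONAL, OUT_OF_SCOPE, PROHIBITED},
}


def invalid_changes(base, profile):
    """Return (name, base status, profile status) for every change the rules do not permit.

    A requirement that the profile does not mention keeps its base status.
    """
    problems = []
    for name, base_status in base.items():
        profile_status = profile.get(name, base_status)
        if profile_status not in ALLOWED[base_status]:
            problems.append((name, base_status, profile_status))
    return problems


# Invented requirement names, for illustration only.
base = {"title": MANDATORY, "abstract": OPTIONAL, "funding": CONDITIONAL}
profile = {"title": OPTIONAL, "abstract": PROHIBITED, "funding": MANDATORY}

print(invalid_changes(base, profile))
```

In this example only the `title` change is reported, because relaxing a mandatory requirement would cause non-conformance to the base standard.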
A profile can include the following information:
- Purpose and audience of the profile
- Statement of scope, function, and purpose of the profile
- Rationale for using a particular standard with DDI
- Scenarios of interoperation
- Allowances for adapting definitions of elements (locally-defined?)
- Constraints on values that can be used with specific elements
- The rationale for creating local elements.
- Details on the customization of existing elements.
- High-level instructions and information on integrating the base standards
- Implementation guidance for those who will be creating the metadata (limited, because implementation environments cannot be known in advance)
- Where detailed instructions can be found in cases of complex implementations
- Instructions and constraints regarding nesting, repeatable elements, etc.
- Sets described in machine-readable form
- Syntax instructions, e.g. regarding encodings
- Usage details for each of the base standards.
- A statement of conformance requirements
- An Implementation Conformance Statement, which states the extent to which the profile complies with the base standards.
See Appendix C for the recommended template for profiles.
Profiles should not repeat the text of the base standards/profiles to which they refer. References to base standards, templates, and registered names of objects are critical to the development of concise profiles.
References to other standards should be to complete sections or clauses, by name or number.
Conformance to DDI
Conformance is the extent to which an implementation of a standard complies with the requirements of a specification of that standard. The degrees of conformance are: full, partial, and non-conformant. The standard of conformance is declared in the statement of intent of a profile.
Implementation Conformance Statements
To evaluate the conformance of a particular implementation, it is necessary to have a statement of the capabilities that have been implemented in support of one or more specifications, specifically including the relevant optional capabilities and limits, so that the implementation can be tested for conformance to the relevant requirements, and only to those requirements. Such a statement is called an Implementation Conformance Statement (ICS).
- Conformance entails performance: a profile has specific capabilities and limits.
- A profile has a value: its capabilities and limits have a place in the research data life cycle/statistical business process (GSBPM or other).
- A profile contains specific evaluation guidance.
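The role of the ICS in scoping a conformance test can be sketched as follows. This is a simplified illustration under stated assumptions: requirements are reduced to a name and a status of "mandatory" or "optional", the ICS is reduced to the set of optional capabilities it declares as implemented, and the function and requirement names are hypothetical.

```python
def requirements_to_test(requirements, ics_capabilities):
    """Select the requirements an implementation should be tested against.

    requirements: dict mapping requirement name -> "mandatory" or "optional"
    ics_capabilities: set of optional capabilities the ICS declares as implemented
    """
    selected = []
    for name, status in requirements.items():
        # Mandatory requirements always apply; optional ones apply only if declared.
        if status == "mandatory" or name in ics_capabilities:
            selected.append(name)
    return sorted(selected)


# Invented requirement names, for illustration only.
profile_requirements = {
    "citation-title": "mandatory",
    "citation-creator": "mandatory",
    "ncube-description": "optional",
    "provenance-chain": "optional",
}
ics = {"ncube-description"}

print(requirements_to_test(profile_requirements, ics))
```

The undeclared option `provenance-chain` is left out of the test set, so the implementation is tested against the relevant requirements and only those.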
An ICS is developed from the results of conformance testing, which involves testing the interoperability scenarios of base standards with DDI. The testing should incorporate use cases critical to the intent of the application or function for which the profile is being developed.
Develop conformance tests according to conformance tests created for the base standards, and provide a Profile Test Specification (PTS). The PTS can be included in the profile or as a separate document referencing the profile.
All of these should be reported in a profile, for those planning to use the profile for an implementation.
See Appendices B and C for several examples of profiles.
Profile formatting and storage
Profiles should be saved in a native, read-only format, for example PDF, and stored in a repository. For example, a PDF copy of a DDI profile can be posted on the DDI Wiki, with permissions for removal and versioning granted to the DDI project manager and profile authors, and others with written delegated authority, in case of emergency.
Profile taxonomy
The taxonomy of DDI profiles follows the research data lifecycle. The topics in the taxonomy (in noun form) are:
- Specification / Conceptualization
- Collection
- Processing
- Dissemination
- Preservation
- Discovery
The taxonomy can be used to classify and name profiles created and registered for DDI.
Naming a profile
DDI profiles should be named according to a naming convention, to facilitate management and retrieval of profiles.
Example: Data Documentation Initiative Profile – Discovery – DDI and DISCO – McGill University Libraries – 2014
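A naming convention of this kind is easy to apply consistently if names are generated rather than typed. The sketch below infers the field order (fixed prefix, taxonomy topic, standards, institution, year) from the example above; that order, the separator, and the function name `profile_name` are assumptions of this sketch rather than a published DDI rule.

```python
# Topics from the profile taxonomy.
TOPICS = (
    "Specification / Conceptualization",
    "Collection",
    "Processing",
    "Dissemination",
    "Preservation",
    "Discovery",
)

SEPARATOR = " – "  # assumption: an en dash with spaces separates the fields


def profile_name(topic, standards, institution, year):
    """Build a profile name from a taxonomy topic, the standards used, the institution, and the year."""
    if topic not in TOPICS:
        raise ValueError(f"unknown taxonomy topic: {topic!r}")
    fields = [
        "Data Documentation Initiative Profile",
        topic,
        " and ".join(standards),
        institution,
        str(year),
    ]
    return SEPARATOR.join(fields)


name = profile_name("Discovery", ["DDI", "DISCO"], "McGill University Libraries", 2014)
print(name)
```

Rejecting topics that are not in the taxonomy keeps registered names classifiable by lifecycle phase.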
Registering a profile
Profiles should be registered with DDI to support information-sharing across the user base and to reduce duplication of work.
The benefits to registering a profile include:
- providing information about and promoting the use of a profile;
- providing authoritative versions of the profile;
- aligning profiles to other schemas, to support interoperability;
- indicating correct use of a namespace for use in XML, RDF, HTML and other environments.
At a minimum, the registration would include the title of the profile, the institution(s) responsible, author names, and date of the profile.
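Since the registration mechanism is not yet defined, the following is only a sketch of how the minimum content could be checked before submission. The field names and the `missing_fields` function are hypothetical, and the author name in the example is a placeholder.

```python
# Hypothetical field names for the minimum registration content.
REQUIRED_FIELDS = ("title", "institutions", "authors", "date")


def missing_fields(record):
    """Return the required registration fields that are absent or empty in the record."""
    return [name for name in REQUIRED_FIELDS if not record.get(name)]


# Placeholder registration record with one field left empty.
registration = {
    "title": "Data Documentation Initiative Profile – Discovery – DDI and DISCO",
    "institutions": ["McGill University Libraries"],
    "authors": [],
    "date": "2014",
}

print(missing_fields(registration))
```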
[EH: Where and how to register profiles?]
Governance
Profiles should be maintained to support common implementations of standards with DDI. In the context of a metadata registry, this can be accomplished through registry governance structures; however, in cases in which a profile is jointly developed, one party should be responsible for managing the profile and handling inquiries. Profiles are updated based on feedback from users who have carried out full profile implementations.
Versioning should be handled by the profile authors, based on broader parameters defined by the DDI project manager.
References
- ISO/IEC TR 10000-1: Information technology – Framework and taxonomy of International Standardized Profiles
- Guidelines for Dublin Core Application Profiles
- Library of Congress: About profiles http://www.loc.gov/z3950/agency/profiles/about.html
- Government of Canada: Viewpoint to Developing a Metadata Application Profile – GCPEDIA. Last modified March 11, 2013. [Accessed June 2, 2014]
Appendix A: Standards by phase - Research data lifecycle
Specify | Collect | Process | Disseminate | Preserve | Discover |
---|---|---|---|---|---|
DCMI (Dublin Core) citations can be leveraged by a software agent to assess the research value of candidate constructs and measures | Protocol Specification is a functional view using GSIM and OWL-S together with DDI. This view is applicable to longitudinal studies as well as multi-mode contact procedures. | PROV can be used to tell a story about entities, activities, and people involved in producing a piece of data or thing over time | DCMI (Dublin Core) citations can point to publications that can assist researchers in the analysis of dissemination datasets | DISCO is one of several functional views. | |
GSIM is borrowed from in the DDI conceptual model and its views. |
| GSIM, OWL-S and DDI are combined in a functional view that supports the description of data processing pipelines. Provenance chains are a special case of a data processing pipeline. | DCAT (Data Catalog Vocabulary) annotates microdata datasets with their themes, publication information and access information. | Vis a vis DISCO, the NIH Data Discovery Index is a more minimal approach to discovery | |
ADMS is a framework for identifying and annotating semantic assets such as code lists |
|
|
| PROV stories can figure into OAIS AIPs and DIPs | Vis a vis DISCO, DwB (Data Without Boundaries) is a more comprehensive approach to data discovery |
|
| Data Cube is used alongside DDI in the description of ncubes |
| OAIS-Data Lake is a specialization of the OAIS reference model that can take the form of metadata tags in a data lake. OAIS-Data Lake is a work in progress. |
|
Appendix B: Profile for DDI 4 + ?
Appendix C: Profile template
The structure for profiles, defined in Annex A (normative) of ISO/IEC TR 10000-1:1998(E), is as follows:
Section | Description |
---|---|
Title page | Prepared using the documentation format defined for the DDI 4 project. |
Contents | Optional. Provides an overall view of the profile and facilitates consultation of the document. Should normally list only the clauses and annexes in the profile. |
Foreword | Mandatory. Consists of information relating to the organization(s) responsible for the profile, as well as (optionally) a statement about whether the profile cancels or replaces other profiles or documents, a statement of major technical changes since the last version of the profile, and a statement on which parts of the profile are normative and which are informative. |
Introduction | Mandatory. Provides information about the process used to draft the profile and the degree of harmonization between the standards described in the profile. |
1. Scope | Mandatory. |
2. Normative references | Mandatory. List of normative documents referenced in the profile, in bibliography form. Cannot include documents that are not publicly available. Note errors or corrections issued for the documents. |
3. Definitions | Optional. Provide definitions necessary for understanding the profile. Use the statement: "For the purposes of this DDI profile, the following definitions apply." |
4. Symbols and abbreviations | Optional. Provide a list of symbols and abbreviations, along with information needed to understand them. |
5. Requirements | Mandatory. |
6. Testing methods | Optional. Description of testing methods used to determine interoperability and conformance levels, as well as the conformance statement (ICS) and testing statement (PTS). |
7. Supplementary elements | Optional. Informative annexes, footnotes, and editorial and layout information. |
Glossary
Implementation Conformance Statement (ICS): A statement made by the supplier of an implementation or IT system claimed to conform to one or more specifications, stating which capabilities have been implemented, specifically including the relevant optional capabilities and limits. (Source: ISO/IEC TR 10000-1:1998(E))
Interoperability: The ability of two or more IT systems to exchange information and to make mutual use of the information that has been exchanged. (Source: ISO/IEC TR 10000-1:1998(E))
Profile: A set of one or more base standards and/or ISPs, and, where applicable, the identification of chosen classes, conforming subsets, options and parameters of those base standards, or ISPs necessary to accomplish a particular function. (Source: ISO/IEC TR 10000-1:1998(E))
Profile Test Specification (PTS): A statement describing the testing methods of the conformance tests carried out on the profile, as well as the test results and general conclusion.