Analysis Part 1

Each working term below is assessed across five specifications: ISO 19115, DDI, DCAT, W3C QB (SDMX), and SSN/SOSA. For each term, the area of coverage is described first, followed by one entry per specification.

Spatial and geographic

Information required for describing geographic information and services (e.g. ISO 19115)

ISO 19115: Good - EX_Extent
DDI: Good - defines geographic area types and specific locations with detailed relationships. Provides the information needed by graphical systems to link to appropriate footprints.
DCAT: Yes - dct:spatial
W3C QB (SDMX): N/A - although the Dimensions defined will in many instances be spatial.
SSN/SOSA: N/A (recommends the use of Basic Geo SpatialThing, or GeoJSON and GeoSPARQL spatial relations).
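
As a concrete illustration of the DCAT and GeoSPARQL terms named above, the following sketch (Python with rdflib; all example.org URIs and the polygon coordinates are hypothetical, not from the source) attaches a spatial coverage to a dcat:Dataset via dct:spatial and gives the location a WKT footprint.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
GEO = Namespace("http://www.opengis.net/ont/geosparql#")
EX = Namespace("http://example.org/")  # hypothetical identifiers for illustration

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)
g.bind("geo", GEO)

dataset = EX["dataset/census-2021"]        # hypothetical dataset
area = EX["place/study-area"]              # hypothetical coverage area
geometry = EX["place/study-area/geometry"]

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.spatial, area))    # dct:spatial points at the coverage
g.add((area, RDF.type, DCTERMS.Location))
g.add((area, GEO.hasGeometry, geometry))   # GeoSPARQL footprint
g.add((geometry, GEO.asWKT, Literal(
    "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))", datatype=GEO.wktLiteral)))

print(g.serialize(format="turtle"))
```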

Temporal

Information required for describing time-based characteristics of the data (e.g. the date of publication, a time stamp)

ISO 19115: Good - EX_Extent
DDI: Very good - publication, data capture, reference, and geographic dates. Developing the use of Allen's intervals in processing.
DCAT: Yes - dct:temporal
W3C QB (SDMX): Ditto (as for Spatial and geographic).
SSN/SOSA: Yes - distinguishes the phenomenonTime and resultTime of an Observation/Actuation/Sampling. For temporal relations between qualified time points the use of OWL Time is recommended.
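
The phenomenonTime/resultTime distinction can be made concrete with a small rdflib sketch (the example.org URIs and timestamps are hypothetical): the phenomenon time is modelled as an OWL Time instant, while the result time is a plain xsd:dateTime literal, as SOSA defines it.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

SOSA = Namespace("http://www.w3.org/ns/sosa/")
TIME = Namespace("http://www.w3.org/2006/time#")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("sosa", SOSA)
g.bind("time", TIME)

obs = EX["obs/42"]
when = EX["obs/42/phenomenon-time"]

g.add((obs, RDF.type, SOSA.Observation))
# phenomenonTime: when the observed condition applied to the feature of interest
g.add((obs, SOSA.phenomenonTime, when))
g.add((when, RDF.type, TIME.Instant))
g.add((when, TIME.inXSDDateTimeStamp,
       Literal("2021-06-01T12:00:00Z", datatype=XSD.dateTimeStamp)))
# resultTime: when the result became available, as a plain literal
g.add((obs, SOSA.resultTime, Literal("2021-06-01T12:05:00Z", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))
```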

Contributors

People, organisations, agents, ...

ISO 19115: Good - CI_ResponsibleParty
DDI: Good - agents (organization, individual, machine) with managed descriptions and relationships. In relation to a specific context (i.e. a role in the processing of data) it can define the role and degree of involvement linked to an agent.
DCAT: Yes - dct:contributor, dct:creator, dct:publisher, prov:wasAttributedTo
W3C QB (SDMX): The spec recommends use of DC to provide basic dataset-level metadata.
SSN/SOSA: N/A (recommends the use of additional RDF vocabularies/namespaces, e.g. DC Terms).
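
The DCAT entry above names the relevant properties; a minimal rdflib sketch (hypothetical example.org URIs, with FOAF used for the agent description) shows them attached to a single agent.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, FOAF, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)
g.bind("prov", PROV)
g.bind("foaf", FOAF)

dataset = EX["dataset/1"]
agency = EX["org/stats-agency"]  # hypothetical publishing organisation

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((agency, RDF.type, FOAF.Organization))
g.add((agency, FOAF.name, Literal("Example Statistics Agency")))
g.add((dataset, DCTERMS.creator, agency))
g.add((dataset, DCTERMS.publisher, agency))
g.add((dataset, PROV.wasAttributedTo, agency))  # PROV attribution alongside the DC terms

print(g.serialize(format="turtle"))
```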

Process

Description of processes, workflows, transformations, ...

DDI: OK - a basic process model (computer program flow) is available for data capture process flow (i.e. questionnaire) and data processing (i.e. derivations, imputation, validation, etc.).
DCAT: Links to PROV rather than handling it directly, e.g. prov:wasGeneratedBy.
W3C QB (SDMX): Ditto (links to PROV rather than handling it directly).
SSN/SOSA: Top-level class for the act of observation, Procedure. Details are delegated to applications. OWL-S can be used for IOPE (Input, Output, Preconditions and Effects) modelling. However, a widely used RDF-based Web process ontology is missing.
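
As a sketch of the "link to PROV" pattern mentioned for DCAT and QB (rdflib; the dataset, activity URIs and timestamp are hypothetical), a derived dataset can point to the prov:Activity that produced it, and the activity records the input it used.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

DCAT = Namespace("http://www.w3.org/ns/dcat#")
PROV = Namespace("http://www.w3.org/ns/prov#")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("dcat", DCAT)
g.bind("prov", PROV)

derived = EX["dataset/imputed"]
source = EX["dataset/raw"]
step = EX["activity/imputation-run"]  # hypothetical processing step

g.add((derived, RDF.type, DCAT.Dataset))
g.add((derived, PROV.wasGeneratedBy, step))  # dataset points to the activity that produced it
g.add((step, RDF.type, PROV.Activity))
g.add((step, PROV.used, source))             # the activity records its input
g.add((step, PROV.endedAtTime, Literal("2021-07-01T00:00:00Z", datatype=XSD.dateTime)))

print(g.serialize(format="turtle"))
```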

Provenance

(Related to process) Descriptions of the process used to create/produce/transform/publish data

ISO 19115: Good - LI_Lineage & ISO 19115-2 extensions.
DDI: Data (variable level) can link to its source (instruction, derivation, question, measure, etc.), providing provenance for the variable. Developing to include provenance down to the datum (case-level response for a variable).
DCAT: Links to PROV, doesn't handle it directly.
W3C QB (SDMX): Ditto.
SSN/SOSA: Alignment with PROV-O, i.e. Observations, Actuations and Samplings are PROV Activities. Other recommended vocabularies: DCAT and VANN.

Vocabularies / lists / classifications

Enumerated lists of terms that may be applied to content being described.

ISO 19115: Vocabularies well defined, but poorly published.
DDI: Supports informal and formal classifications, including statistical, geographic, and topical. Supports the use of externally managed vocabularies, citing source and value throughout the model.
DCAT: Several properties have a range of skos:Concept or typed values, so it promotes the use of enumerated lists.
W3C QB (SDMX): Relies on skos:Concept to classify observation types etc.
SSN/SOSA: N/A. Relies on skos:Concept to classify observations/samplings/actuations, procedures, sensors/actuators, Features of Interest or Properties.
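
To illustrate the skos:Concept pattern that several of these specifications lean on, a short rdflib sketch (the scheme and concept URIs are hypothetical) defines a small concept and uses it as the value of dcat:theme, whose range is skos:Concept.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, SKOS

DCAT = Namespace("http://www.w3.org/ns/dcat#")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("dcat", DCAT)
g.bind("skos", SKOS)

scheme = EX["themes"]           # hypothetical concept scheme
health = EX["themes/health"]    # hypothetical concept
dataset = EX["dataset/1"]

g.add((scheme, RDF.type, SKOS.ConceptScheme))
g.add((health, RDF.type, SKOS.Concept))
g.add((health, SKOS.prefLabel, Literal("Health", lang="en")))
g.add((health, SKOS.inScheme, scheme))

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCAT.theme, health))  # dcat:theme expects a skos:Concept

print(g.serialize(format="turtle"))
```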

Resources

Objects being described or referenced. May include datasets, but also publications, software, code, other metadata, ...

ISO 19115: In principle yes, through MD_ScopeCode; in practice used primarily for datasets.
DDI: Incorporates the ability to link external resources in any format to the intellectual content of the element.
DCAT: Can describe any digital asset, including the traditional meaning of the word 'document' (textual data), images, multimedia.
W3C QB (SDMX): N/A
SSN/SOSA: Individual observations/results (database cells). Recommends the RDF Data Cube vocabulary for complex results. Also covers sensors and platforms.

Datasets

Specific descriptions of datasets as primary objects


DDI: Description at both the logical and physical level. Description of storage formats continues to expand.
DCAT: That's the explicit aim, yes.
W3C QB (SDMX): No - QB is used to structure datasets, i.e. an instance of a QB cube is a dataset. Again, basic metadata should be provided using DC, including classification, with more detail via specialist vocabularies (DCAT, VoID, etc.).
SSN/SOSA: Only individual results (database cells). Recommends DCAT for dataset references.

Observation / Capture

Classes/objects that describe the processes by which data is created, generated, captured, transformed. (QUESTION: Is this the same as Provenance/Process?)


DDI: Questionnaires and questions are well defined for management over time, organizing into a flow, providing instructions, and generating instruments in multiple modes. Products exist to generate and administer questionnaires from the metadata. Measurements (non-question) have recently been added. Secondary use of data can be sourced.
DCAT: Yes, this is the same as provenance. See above.
W3C QB (SDMX): As above.
SSN/SOSA: Procedure (a schedule of PROV-O Activities), plus Sensor/Actuator/Sampler to describe the devices that capture the data.
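
The SSN/SOSA entry can be made concrete with a small rdflib sketch (all example.org URIs and the result value are hypothetical): one observation links to the sensor that made it, the procedure it followed, the property observed and the feature of interest.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

SOSA = Namespace("http://www.w3.org/ns/sosa/")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("sosa", SOSA)

obs = EX["obs/42"]
sensor = EX["sensor/thermometer-1"]        # hypothetical device
procedure = EX["procedure/daily-reading"]  # hypothetical capture procedure

g.add((sensor, RDF.type, SOSA.Sensor))
g.add((procedure, RDF.type, SOSA.Procedure))

g.add((obs, RDF.type, SOSA.Observation))
g.add((obs, SOSA.madeBySensor, sensor))
g.add((obs, SOSA.usedProcedure, procedure))
g.add((obs, SOSA.observedProperty, EX["property/air-temperature"]))
g.add((obs, SOSA.hasFeatureOfInterest, EX["feature/site-A"]))
g.add((obs, SOSA.hasSimpleResult, Literal("21.5", datatype=XSD.double)))

print(g.serialize(format="turtle"))
```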

Data

The logical structure of the data being described - variables, units of measurement, concepts, sample units and populations, records, datum(s), cells. (May or may not be a subset of datasets)


DDI: Comprehensive metadata on the logical content of the data, down to the case and variable. Documentation is both descriptive and machine actionable for automated processing (e.g. scripts for statistical packages can be generated from the metadata). This covers microdata (unit) and structured (dimensional) data down to the logical description of a cell. Developing coverage down to the datum.
DCAT: Indicates format (serialization) and may also point to a standard to which a dataset conforms (dct:conformsTo - schema, structure, ...), but does not provide structural metadata itself.
W3C QB (SDMX): Yes. This is explicitly what the RDF Data Cube does.
SSN/SOSA: XSD Schema datatypes for Observation/Sampling/Actuation activities. For units of measurement other vocabularies are recommended (e.g. QUDT and OM).
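
Since the RDF Data Cube is the one specification here that carries structural metadata itself, a compressed rdflib sketch (the dimension, measure, dataset URIs and figure are hypothetical) shows the shape: a data structure definition, the dataset that uses it, and one observation, i.e. one logical cell.

```python
from rdflib import BNode, Graph, Literal, Namespace
from rdflib.namespace import RDF, XSD

QB = Namespace("http://purl.org/linked-data/cube#")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("qb", QB)
g.bind("ex", EX)

# Structural metadata: one dimension and one measure, bundled into a structure definition
dim_period = EX["dimension/refPeriod"]   # hypothetical dimension property
meas_pop = EX["measure/population"]      # hypothetical measure property
dsd = EX["structure/population"]

g.add((dim_period, RDF.type, QB.DimensionProperty))
g.add((meas_pop, RDF.type, QB.MeasureProperty))
g.add((dsd, RDF.type, QB.DataStructureDefinition))
c1, c2 = BNode(), BNode()
g.add((dsd, QB.component, c1))
g.add((c1, QB.dimension, dim_period))
g.add((dsd, QB.component, c2))
g.add((c2, QB.measure, meas_pop))

# The dataset and a single observation ("cell")
dataset = EX["dataset/population"]
g.add((dataset, RDF.type, QB.DataSet))
g.add((dataset, QB.structure, dsd))

obs = EX["dataset/population/obs-1"]
g.add((obs, RDF.type, QB.Observation))
g.add((obs, QB.dataSet, dataset))
g.add((obs, dim_period, Literal("2021", datatype=XSD.gYear)))
g.add((obs, meas_pop, Literal(5432100, datatype=XSD.integer)))

print(g.serialize(format="turtle"))
```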

Storage

The physical representation of the data (files, formats, locations, ...)

DCAT: Yes - dcat:mediaType, dcat:accessURL.
W3C QB (SDMX): RDF in any serialisation (check the MIME type).
SSN/SOSA: N/A. Recommends the use of DC Terms and DCAT.
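
The two dcat: properties listed above sit on a dcat:Distribution; a minimal rdflib sketch (the dataset, distribution and download location are hypothetical) shows the usual pattern.

```python
from rdflib import Graph, Namespace, URIRef
from rdflib.namespace import RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("dcat", DCAT)

dataset = EX["dataset/population"]
dist = EX["dataset/population/csv"]  # hypothetical distribution

g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCAT.distribution, dist))
g.add((dist, RDF.type, DCAT.Distribution))
# Media type as an IANA media-type URI, access point as a (hypothetical) URL
g.add((dist, DCAT.mediaType, URIRef("https://www.iana.org/assignments/media-types/text/csv")))
g.add((dist, DCAT.accessURL, URIRef("https://data.example.org/population.csv")))

print(g.serialize(format="turtle"))
```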

Access

Who can access the data and how


DDI: Access requirements and restrictions (including citation requirements and confidentiality) are available at the study or dataset level across DDI. Access information can be associated with variables in DDI 3 and later. Development is under way to specify, in a consistent manner, access restrictions imposed on the data and restrictions imposed by archives/repositories, as well as item-level annotation information. Currently published DDI also allows the use of Dublin Core access fields at the level of the study/dataset.
DCAT: dcterms:license, dcterms:rights, odrl:hasPolicy.
W3C QB (SDMX): Again, internal DC for basics, external vocabularies for detail.
SSN/SOSA: N/A. Recommends the use of DC Terms.
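
The three DCAT-side properties can be sketched with rdflib (the distribution, policy URI and rights statement are hypothetical; the licence URI is the standard CC BY 4.0 one): the licence and rights statement cover the basics, while odrl:hasPolicy hands detailed access conditions to an ODRL policy.

```python
from rdflib import Graph, Literal, Namespace, URIRef
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
ODRL = Namespace("http://www.w3.org/ns/odrl/2/")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)
g.bind("odrl", ODRL)

dist = EX["dataset/population/csv"]
policy = EX["policy/registered-users-only"]  # hypothetical ODRL policy

g.add((dist, RDF.type, DCAT.Distribution))
g.add((dist, DCTERMS.license, URIRef("https://creativecommons.org/licenses/by/4.0/")))
g.add((dist, DCTERMS.rights, Literal("Cite the data producer when reusing this file.")))
g.add((dist, ODRL.hasPolicy, policy))  # detailed access conditions live in the ODRL policy
g.add((policy, RDF.type, ODRL.Policy))

print(g.serialize(format="turtle"))
```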

Administrative / core /...

Foundational classes for use in building the specification - e.g. identification, versioning, primitives



DCAT: Based heavily on Dublin Core. DCAT defines very few classes of its own and imposes minimal semantics.
W3C QB (SDMX): Explicitly an RDF expression of SDMX.
SSN/SOSA: N/A. Recommends the use of DC Terms and PROV-O.

Extension

DDI: DDI 3 supports a structured means of extending content in specific ways (local identifiers, local key/value pairs). DDI 4 allows for the specification of a reusable vocabulary for keys in key-value pairs.
DCAT: Add terms from any RDF vocabularies.
W3C QB (SDMX): Yes - it's RDF. See QB4ST for an example.
SSN/SOSA: Add terms from any RDF vocabularies.
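
For the RDF-based specifications, "extension" amounts to mixing in terms from other namespaces; a short rdflib sketch (the local property and the schema.org typing are illustrative additions, not from the source) makes the point.

```python
from rdflib import Graph, Literal, Namespace
from rdflib.namespace import DCTERMS, RDF

DCAT = Namespace("http://www.w3.org/ns/dcat#")
SDO = Namespace("https://schema.org/")
EX = Namespace("http://example.org/")  # hypothetical identifiers

g = Graph()
g.bind("dcat", DCAT)
g.bind("dcterms", DCTERMS)
g.bind("schema", SDO)

dataset = EX["dataset/population"]
g.add((dataset, RDF.type, DCAT.Dataset))
g.add((dataset, DCTERMS.title, Literal("Population by year", lang="en")))
# Extension is just mixing in terms from other namespaces:
g.add((dataset, RDF.type, SDO.Dataset))                        # schema.org typing alongside DCAT
g.add((dataset, EX["local/archiveSeries"], Literal("POP-A")))  # hypothetical local key/value pair

print(g.serialize(format="turtle"))
```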

Web compatibility

Is it easy to use in a web environment? (URLs, etc.)

ISO 19115: Not so good.
DCAT: Full. See also schema:Dataset.
W3C QB (SDMX): Full.
SSN/SOSA: Full.