
Name (working term) | Area of coverage | Demographic Data (Gridded Population of the World (GPW), v4) | IDDO CDISC Ebola Data | WHO Ebola Mortality Data Sets (Basic Download; XML)

Spatial and geographic

Information required for describing geographic information and services (e.g. ISO 19115)

GPW v4: Available in an XML format (FGDC) that provides bounding boxes. Granularity is an issue.
IDDO CDISC: Clinical data collected by different organisations across a number of hospitals (this is an assumption). The CDISC data dictionary seems to deal only with country. The location of the clinic is important information, but it is also potentially disclosive. It is probably in the data set, but it is not clear where.
WHO: National level. The CSV and JSON downloads have location information; when the query was hacked to produce XML, the locations were not included.
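As a rough illustration of the FGDC bounding boxes mentioned for the GPW metadata, the sketch below reads the four bounding coordinates from an FGDC CSDGM record. The element names (`idinfo/spdom/bounding` with `westbc`, `eastbc`, `northbc`, `southbc`) follow the FGDC standard; the sample record and its coordinate values are hypothetical and are not taken from the GPW metadata itself.

```python
import xml.etree.ElementTree as ET

# Hypothetical FGDC CSDGM fragment; the coordinate values are illustrative only.
SAMPLE_FGDC = """<metadata>
  <idinfo>
    <spdom>
      <bounding>
        <westbc>-180.0</westbc>
        <eastbc>180.0</eastbc>
        <northbc>85.0</northbc>
        <southbc>-60.0</southbc>
      </bounding>
    </spdom>
  </idinfo>
</metadata>"""


def read_bounding_box(xml_text):
    """Return (west, south, east, north) from an FGDC record, or None if absent."""
    root = ET.fromstring(xml_text)
    bounding = root.find("./idinfo/spdom/bounding")
    if bounding is None:
        return None
    values = {}
    for tag in ("westbc", "southbc", "eastbc", "northbc"):
        text = bounding.findtext(tag)
        if text is None:
            # An incomplete bounding box is treated as missing.
            return None
        values[tag] = float(text)
    return (values["westbc"], values["southbc"], values["eastbc"], values["northbc"])


bbox = read_bounding_box(SAMPLE_FGDC)
print(bbox)
```

A single bounding box per record only describes the overall extent, which is consistent with the granularity concern noted above.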

Temporal

Information required for describing time-based characteristics of the data (e.g. the date of publication, a time stamp)

GPW v4: Changes over time, but slowly. Periodic updates. Good enough for a basic model.
IDDO CDISC: Dates are recorded for a number of variables, including the dates on which particular observations happened. Temporal information is recorded in the data dictionary and pertains to a number of the events. It covers both observation time and event time, and includes spell information.
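CDISC SDTM date/time variables are written as ISO 8601 strings and may be truncated to a month or a year when the full date is unknown. Whether the IDDO data follow this convention is an assumption here, not something this page confirms. Under that assumption, a minimal sketch of normalising such values while keeping track of their precision could look like this (the function name and the sample values are hypothetical):

```python
import re
from datetime import date

# Year, optionally followed by month, day, and a time part that is ignored.
_PARTIAL_DATE = re.compile(r"^(\d{4})(?:-(\d{2})(?:-(\d{2})(?:T.*)?)?)?$")


def parse_partial_date(value):
    """Return (earliest possible date, precision) for a possibly truncated ISO 8601 date."""
    match = _PARTIAL_DATE.match(value.strip())
    if match is None:
        raise ValueError(f"not an ISO 8601 date: {value!r}")
    year, month, day = match.groups()
    if day is not None:
        precision = "day"
    elif month is not None:
        precision = "month"
    else:
        precision = "year"
    # Missing components default to 1, so the result is the earliest matching date.
    return date(int(year), int(month or 1), int(day or 1)), precision


for raw in ("2014-09-15", "2014-09", "2014"):
    print(raw, parse_partial_date(raw))
```

Keeping the precision alongside the date avoids treating a month-level observation as if it happened on the first of the month.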

Contributors

People, organisations, agents, ...

GPW v4: The data aggregator is known. There is a detailed, manual description of the data sources.
IDDO CDISC: Organisational identifiers? Hospitals, clinics, temporary treatment units? To what extent was this collected, and is it contained in the data model? There is a field for evaluator, which gives some indication. There are identification risks, but this may not have been collected.

Process

Description of processes, workflows, transformations, ...

GPW v4: The importance of this depends on the research question. This dataset has a detailed, human-readable account of the process and provenance of the data product.
IDDO CDISC: There is a document describing the process that went into compiling the data. Internal documentation.

Provenance

(Related to process) Descriptions of the process used to create/produce/transform/publish data

Covered in cell above.

Vocabularies / lists / classifications

Enumerated lists of terms that may be applied to content being described.

GPW v4: No obvious incompatibilities; can be worked around because of the data dictionary.
IDDO CDISC: Uses standard CDISC domains. Customisations can be rolled back into the CDISC ontology. CDISC SHARE references a number of vocabularies.

Resources

Objects being described or referenced. May include datasets, but also publications, software, code, other metadata, ...

GPW v4: Not applicable for use case.
IDDO CDISC: Varies by organisational source. Often a relatively raw data dump. Compiled from PDF forms, which can be referenced.

Datasets

Specific descriptions of datasets as primary objects

GPW v4: Yes.
IDDO CDISC: Jay will explore this.

Observation / Capture

Classes/objects that describe the processes by which data is created, generated, captured, transformed. (QUESTION: Is this the same as Provenance/Process?)


See above, process description.  Dictionary has information about the devices used to make measurements.

Data

The logical structure of the data being described - variables, units of measurement, concepts, sample units and populations, records, datum(s), cells. (May or may not be a subset of datasets)

GPW v4: Aggregate data set. Estimate of gridded population against time. Dimensionality to be identified.
IDDO CDISC: Detailed data dictionary. There is a separate standard, related to the SDTM standards, called ADaM. Is IDDO using this?

Storage

The physical representation of the data (files, formats, locations, ...)

GPW v4: NetCDF, GeoTIFF, ASCII.
IDDO CDISC: Part of the CDISC package. Typically XML. The CDISC standards have tools. Integrated with clinical systems. Generated directly by clinical systems.
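Of the GPW formats listed, the ASCII option is the simplest to inspect without extra libraries. The sketch below assumes that "ASCII" means the Esri ASCII grid layout (a short header of `ncols`, `nrows`, `xllcorner`, `yllcorner`, `cellsize`, `NODATA_value`, followed by rows of cell values); this page does not confirm that, and the sample grid is hypothetical.

```python
import io

# Hypothetical Esri ASCII grid; the values are illustrative, not GPW data.
SAMPLE_GRID = """ncols 4
nrows 2
xllcorner -180.0
yllcorner -90.0
cellsize 90.0
NODATA_value -9999
1 2 -9999 4
5 6 7 8
"""

HEADER_KEYS = {
    "ncols", "nrows", "xllcorner", "yllcorner",
    "xllcenter", "yllcenter", "cellsize", "nodata_value",
}


def read_ascii_grid(stream):
    """Return (header dict, rows of cell values) with no-data cells set to None."""
    header = {}
    rows = []
    for line in stream:
        parts = line.split()
        if not parts:
            continue
        key = parts[0].lower()
        if key in HEADER_KEYS:
            header[key] = float(parts[1])
        else:
            rows.append([float(p) for p in parts])
    nodata = header.get("nodata_value")
    cells = [[None if v == nodata else v for v in row] for row in rows]
    return header, cells


header, cells = read_ascii_grid(io.StringIO(SAMPLE_GRID))
print(header["ncols"], header["nrows"], cells[0])
```

The header alone gives the grid dimensions and cell size, which is a quick way to check the spatial granularity of a download before loading it in full.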

Access

Who can access the data and how

"You are required to login to download data or maps. Click "LOGIN" to proceed to log in or to register. If you click "CANCEL", you may browse the page but you will still be required to login to download data or maps."

Is there an API to access these data? Or is it just by selection and download?

Data access committee.  External community can apply for access.  Data providers can determine to what extent they wish data to be made available.  What criteria are used? How does the data gatekeeper manage this? What restrictions are imposed?  Is access information documented?

Administrative / core / ...

Foundational classes for use in building the specification - e.g. identification, versioning, primitives


CDISC standard.
Web compatibility?

Is it easy to use in a web environment? URLs, etc.

GPW v4: Requires download.
IDDO CDISC: Possibly, in so far as CDISC is, but IDDO restricts access.

Is the resource maintained and supported?

GPW v4: Yes.
IDDO CDISC: Curated and maintained by IDDO.

Updating?

Dynamic or batch?

GPW v4: Periodic, every 3-4 years.
IDDO CDISC: Periodic, according to the extent to which different organisations provide the data.

Capacity for extensions?

Over the past few years CDISC has developed an ontology (CDISC SHARE) to connect data in different CDISC domains. There is a standard resource to help.
