Notes from October 23rd

Qualitative data description:

Approaches for handling unusual data and metadata
Dealing with new data sources that cannot currently be described in DDI: at design time the understanding of them is either not good enough, not fully agreed, or unknown

Three approaches

1) Simple controlled vocabulary
- too flexible
- doesn't allow you to describe what you really need
- no data type
- choices
+ it allows people to do what they want, highly flexible
- doesn't communicate requirements in a structured way, no documentation


2) Use the DataRecord with the variables defined by DDI
Similar to SDTM in CDISC
- can't tell whether this is a description of a variable in a regular data set or the meta-use of a variable to describe a data source; overloads the meaning of variable
+ allows use of the normal DDI infrastructure to describe variables
+ people are familiar with the data record; it's still a data record
+ harvestable for reuse in analysis, easy to merge with other data for analysis (see Larry's paper for NADDI2013)
- envisioned as a means of capturing annotation of, say, an Excel file that uses color and Excel annotations for additional data
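
A rough sketch of approach 2 may help make the trade-off concrete. It is not DDI syntax: the `Variable` and `DataRecord` classes, the variable names (`sheet`, `cell`, `fill_color`, `comment`) and all values are made up for illustration. It only shows the idea that annotations become rows of an ordinary data record, which can then be merged with the data they describe.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variable:
    """Hypothetical stand-in for a variable description (not a DDI class)."""
    name: str
    data_type: type


@dataclass
class DataRecord:
    """Hypothetical stand-in for a data record: variables plus rows."""
    name: str
    variables: list
    rows: list = field(default_factory=list)

    def add_row(self, **values):
        # Every described variable must be present with the declared type.
        for var in self.variables:
            if not isinstance(values.get(var.name), var.data_type):
                raise TypeError(f"{var.name} must be {var.data_type.__name__}")
        self.rows.append(values)


# The annotation itself is described with ordinary variables.
annotations = DataRecord(
    name="cell_annotations",
    variables=[
        Variable("sheet", str),
        Variable("cell", str),
        Variable("fill_color", str),
        Variable("comment", str),
    ],
)
annotations.add_row(sheet="Income", cell="B7",
                    fill_color="yellow", comment="imputed value")

# Made-up data values, keyed by (sheet, cell), to show the merge.
data = {("Income", "B7"): 41200}
merged = [
    {**row, "value": data.get((row["sheet"], row["cell"]))}
    for row in annotations.rows
]
```

Note that nothing in the sketch distinguishes `cell_annotations` from a regular data set, which is exactly the overloading concern listed above.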


3) Controlled schema language for describing data records - set of objects that provide a metamodel for describing data records (similar to SDMX reference metadata)
Similar to SDTM in CDISC
= Arofan thinks it's great
+ powerful
- complex
- how does it interact with RDF and XML implementations?
+ part of the standard (SDMX) is the binding http://sdmx.org/wp-content/uploads/2009/01/mwg-2007-5-1-sdmx-reference-metadata-support-v3.pdf


IN GENERAL:
Have a similar approach for controlled vocabularies internally and externally
Within DDI we should have a model for describing controlled vocabularies that can be used to describe the CV
The reference would be to an internal or an external CV
Retain the ability to use non-validatable external CVs but extend the ability to validate those that can/should be validated

 

Should an item be required or not
data type
linear structure (hierarchical, relational, etc.)
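
The three properties above (required or not, data type, structure) could be carried by a small CV description, sketched below. This is only an illustration under assumptions: `CVItem`, `CVDescription`, the `validatable` flag and the example item names are hypothetical and do not come from DDI.

```python
from dataclasses import dataclass, field


@dataclass
class CVItem:
    """Hypothetical description of one item in a controlled vocabulary."""
    name: str
    data_type: type
    required: bool = False


@dataclass
class CVDescription:
    """Hypothetical description of a CV, internal or external."""
    name: str
    structure: str  # e.g. "linear", "hierarchical", "relational"
    items: list = field(default_factory=list)
    validatable: bool = True  # False for external CVs that cannot be checked

    def validate(self, entry):
        """Return a list of problems; an empty list means the entry passes."""
        if not self.validatable:
            return []
        problems = []
        for item in self.items:
            if item.name not in entry:
                if item.required:
                    problems.append(f"missing required item: {item.name}")
            elif not isinstance(entry[item.name], item.data_type):
                problems.append(f"wrong data type for item: {item.name}")
        return problems


cv = CVDescription(
    name="example_cv",
    structure="linear",
    items=[CVItem("code", str, required=True), CVItem("rank", int)],
)
ok = cv.validate({"code": "A", "rank": 1})
bad = cv.validate({"rank": "first"})
```

A CV with `validatable=False` always passes, which mirrors the wish to keep non-validatable external CVs usable while validating those that can be validated.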

 

USE Case from Citation:

1 - You have a pool of keys and you're going to create multiple data records from that pool, one for each type of annotation (e.g. a citation, or a review of a question by OMB which contains a citation and other information)

2 - Excel spreadsheet where additional qualitative data is expressed as color, emphasis, Excel comments, links to other sheets, with comment
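
Use case 1 can be sketched as follows. The key names, the two annotation types and all values are placeholders invented for the example; the point is only that several data records share one pool of keys, so all annotations for a key can be collected afterwards.

```python
# Shared pool of keys (placeholder identifiers, e.g. for questions).
key_pool = {"Q1", "Q2", "Q3"}

# One data record (here simply a list of rows) per annotation type.
records = {"citation": [], "omb_review": []}


def annotate(kind, key, **values):
    """Add a row to the record for `kind`, using a key from the pool."""
    if key not in key_pool:
        raise KeyError(f"unknown key: {key}")
    records[kind].append({"key": key, **values})


def annotations_for(key):
    """Collect the rows for one key across all annotation types."""
    return {
        kind: [row for row in rows if row["key"] == key]
        for kind, rows in records.items()
    }


annotate("citation", "Q1", source="placeholder citation")
annotate("omb_review", "Q1", citation="placeholder citation",
         outcome="placeholder outcome")
annotate("citation", "Q2", source="another placeholder citation")

q1 = annotations_for("Q1")
```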

 

 

Please share other files here:
