Qualitative Working Group
Notes from October 23rd
Qualitative data description:
Approaches for handling unusual data and metadata
Dealing with new data sources that can't currently be described in DDI - the understanding at design time is either not good enough, not fully agreed, or unknown
Three approaches
1) Simple controlled vocabulary
- too flexible
- doesn't allow you to describe what you really need
- no data type
- choices
+ it allows people to do what they want, highly flexible
- doesn't communicate requirements in a structured way, no documentation
2) Use the DataRecord with the variables defined by DDI
Similar to SDTM in CDISC
- don't know if this is a description of a variable in a regular data set or if it's the meta-use of a variable to describe a data source; overloads the meaning of variable
+ allows use of the normal DDI infrastructure to describe variables
+ people are familiar with the data record; it's still a data record
+ harvestable for reuse in analysis, easy to merge with other data for analysis (see Larry's paper for NADDI2013)
- envisioned as a means of capturing annotation of, say, an Excel file that uses color and Excel annotations for additional data
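As a rough illustration of approach 2, the sketch below treats spreadsheet annotations as ordinary records whose columns are declared as variables. The names Variable and DataRecord and the example variables (cell, annotation_type, value) are hypothetical, chosen for illustration; they are not DDI element names.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Variable:
    """Hypothetical stand-in for a variable definition (not a DDI class)."""
    name: str
    data_type: type


@dataclass
class DataRecord:
    """A record layout plus the rows that conform to it."""
    variables: list
    rows: list = field(default_factory=list)

    def add_row(self, row: dict) -> None:
        # Reject rows that do not match the declared variables and types.
        expected = {v.name: v.data_type for v in self.variables}
        if set(row) != set(expected):
            raise ValueError(
                f"expected variables {sorted(expected)}, got {sorted(row)}"
            )
        for name, value in row.items():
            if not isinstance(value, expected[name]):
                raise TypeError(f"{name} must be {expected[name].__name__}")
        self.rows.append(row)


# Invented example: two annotations attached to the same spreadsheet cell.
cell_notes = DataRecord([
    Variable("cell", str),
    Variable("annotation_type", str),
    Variable("value", str),
])
cell_notes.add_row({"cell": "B2", "annotation_type": "color", "value": "yellow"})
cell_notes.add_row({"cell": "B2", "annotation_type": "comment", "value": "imputed"})
```

Because the annotations are plain rows keyed by cell, they could be merged with other data in the usual way. The overloading concern noted above still applies: nothing in the record itself says these variables describe a data source rather than a regular data set.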
3) Controlled schema language for describing data records - set of objects that provide a metamodel for describing data records (similar to SDMX reference metadata)
Similar to SDTM in CDISC
= Arofan thinks it's great
+ powerful
- complex
- how does it interact with RDF and XML implementations?
+ part of the standard (SDMX) is the binding http://sdmx.org/wp-content/uploads/2009/01/mwg-2007-5-1-sdmx-reference-metadata-support-v3.pdf
IN GENERAL:
Have a similar approach for controlled vocabularies internally and externally
Within DDI we should have a model for describing controlled vocabularies that can be used to describe the CV
The reference would be to an internal or an external CV
Retain the ability to use non-validatable external CVs but extend the ability to validate those that can/should be validated
Should an item be required or not
data type
linear structure (hierarchical, relational, etc.)
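One possible reading of the points above (required or not, data type, structure, and validatable versus non-validatable CVs) is sketched below. The class and field names are assumptions made for illustration and do not come from DDI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ControlledVocabulary:
    """Hypothetical description of a CV; field names are illustrative only."""
    name: str
    location: str          # "internal" or "external"
    data_type: type
    structure: str         # e.g. "hierarchical" or "relational"
    required: bool
    codes: Optional[frozenset] = None  # None means the CV cannot be validated

    def check(self, value) -> Optional[bool]:
        """Return True/False when validation is possible, None when it is not."""
        if value is None:
            # A missing value is acceptable only for optional items.
            return not self.required
        if not isinstance(value, self.data_type):
            return False
        if self.codes is None:
            # Non-validatable CV: the value is accepted but cannot be checked.
            return None
        return value in self.codes


# Invented examples: one validatable internal CV, one non-validatable external CV.
annotation_types = ControlledVocabulary(
    "AnnotationType", "internal", str, "hierarchical", True,
    frozenset({"color", "comment", "link"}),
)
free_tags = ControlledVocabulary(
    "ProjectTags", "external", str, "relational", False,
)
```

Returning None for a non-validatable CV keeps the two cases distinct: the external CV stays usable, while a CV that supplies its codes can be checked strictly.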
USE Case from Citation:
1 - You have a pool of keys and you're going to create multiple data records from that pool, one for each type of annotation (e.g. a citation, or a review of a question by OMB which contains a citation and other information)
2 - Excel spreadsheet where additional qualitative data is expressed as color, emphasis, Excel comments, links to other sheets, with comment
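Use case 1 could be sketched roughly as follows: a shared pool of keyed annotations is split into one set of records per annotation type. The keys, field names, and values are invented for illustration.

```python
from collections import defaultdict

# Hypothetical pool of keyed annotations; keys and field names are invented.
pool = [
    {"key": "Q1", "type": "citation", "source": "example reference"},
    {"key": "Q1", "type": "review", "reviewer": "OMB", "source": "example reference"},
    {"key": "Q2", "type": "citation", "source": "another reference"},
]


def records_by_type(items):
    """Group annotations into one list of rows per annotation type."""
    grouped = defaultdict(list)
    for item in items:
        # Everything except the type becomes a column of that type's record.
        row = {k: v for k, v in item.items() if k != "type"}
        grouped[item["type"]].append(row)
    return dict(grouped)


records = records_by_type(pool)
```

Each annotation type ends up with its own record layout (the review rows carry a reviewer column that the citation rows lack), while the shared key still links every row back to the item it annotates.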
Please share other files here: