Qualitative data description: Approaches for handling unusual data and metadata

Dealing with new data sources that can't currently be described in DDI: the understanding at design time is either not good enough, not totally agreed, or unknown.

Three approaches

1) Simple controlled vocabulary
  - too flexible
  - doesn't allow you to describe what you really need
  - no data type
  - choices
  + it allows people to do what they want, highly flexible
  - doesn't communicate requirements in a structured way, no documentation

2) Use the DataRecord with the variables defined by DDI
  Similar to SDTM in CDISC
  - don't know if this is a description of a variable in a regular data set or if it's the meta-use of a variable to describe a data source; overloads the meaning of variable
  + allows use of the normal DDI infrastructure to describe variables
  + people are familiar with the data record; it's still a data record
  + harvestable for reuse in analysis, easy to merge with other data for analysis (see Larry's paper for NADDI2013)
  - envisioned as a means of capturing annotation of, say, an Excel file that uses color and Excel annotations for additional data
3) Controlled schema language for describing data records: a set of objects that provide a metamodel for describing data records (similar to SDMX reference metadata)
  Similar to SDTM in CDISC
  = Arofan thinks it's great
  + powerful
  - complex
  - how does it interact with RDF and XML implementations?
  + part of the standard (SDMX) is the binding
  http://sdmx.org/wp-content/uploads/2009/01/mwg-2007-5-1-sdmx-reference-metadata-support-v3.pdf
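To make approach 3 a little more concrete, here is a minimal sketch of a metamodel for describing data records, with the data type and required/optional information that approach 1 lacks. The names `AttributeDef`, `RecordDef`, `validate` and the `CellAnnotation` example are hypothetical illustrations only; they are not DDI, SDMX or CDISC classes.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class AttributeDef:
    """One attribute of a described record (hypothetical, not a DDI/SDMX class)."""
    name: str
    data_type: type
    required: bool = True


@dataclass(frozen=True)
class RecordDef:
    """Hypothetical metamodel object: a named set of attribute definitions."""
    name: str
    attributes: tuple[AttributeDef, ...]


def validate(record_def: RecordDef, record: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record conforms."""
    problems = []
    known = {a.name for a in record_def.attributes}
    for attr in record_def.attributes:
        if attr.name not in record:
            if attr.required:
                problems.append(f"missing required attribute: {attr.name}")
        elif not isinstance(record[attr.name], attr.data_type):
            problems.append(f"wrong type for {attr.name}")
    for key in record:
        if key not in known:
            problems.append(f"unknown attribute: {key}")
    return problems


# Illustrative definition for annotating a spreadsheet cell (made-up attributes).
cell_annotation = RecordDef(
    name="CellAnnotation",
    attributes=(
        AttributeDef("cell", str),
        AttributeDef("color", str),
        AttributeDef("comment", str, required=False),
    ),
)
```

Calling `validate(cell_annotation, {"cell": "B2"})` would report the missing `color` attribute, which is the kind of structured requirement a plain controlled vocabulary cannot communicate.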

IN GENERAL:
To have a similar approach for controlled vocabularies internally and externally.
Within DDI we should have a model for describing controlled vocabularies that can be used to describe the CV.
The reference would be to an internal or an external CV.
Retain the ability to use non-validatable external CVs, but extend the ability to validate those that can/should be validated.
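One way to picture the distinction between validatable and non-validatable CVs is sketched below. This is only an assumption about how such a reference might look; `CVReference`, `check_value` and the two example vocabularies are made up for illustration and are not part of DDI.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CVReference:
    """Hypothetical reference to a controlled vocabulary (not an actual DDI class)."""
    name: str
    location: str                      # "internal" or "external"
    codes: Optional[frozenset] = None  # None means the CV cannot be validated


def check_value(cv: CVReference, value: str) -> Optional[bool]:
    """True/False for a validatable CV; None when no validation is possible."""
    if cv.codes is None:
        return None
    return value in cv.codes


# Made-up vocabularies for illustration only.
annotation_type = CVReference("AnnotationType", "internal", frozenset({"citation", "review"}))
free_keywords = CVReference("Keywords", "external")
```

The three-valued result keeps the non-validatable external case usable (the value is simply accepted unchecked) while still allowing strict checking where a code list is available.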

Should an item be required or not
data type
linear structure (hierarchical, relational, etc.)

USE Case from Citation:
1 - You have a pool of keys and you're going to create multiple data records from that pool, one for each type of annotation (e.g. a citation, or a review of a question by OMB which contains a citation and other information)
2 - Excel spreadsheet where additional qualitative data is expressed as color, emphasis, Excel comments, links to other sheets, with comment
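
Use case 1 could be sketched roughly as follows: one shared pool of keys, and one record layout per annotation type drawn from that pool. All key names and the `make_layout` helper are hypothetical, chosen only to mirror the citation and OMB review example.

```python
# Hypothetical pool of keys with their data types (names are made up).
KEY_POOL = {
    "title": str,
    "author": str,
    "year": int,
    "reviewer": str,
    "decision": str,
}


def make_layout(keys):
    """Build a record layout (key -> type) drawing only on keys from the pool."""
    unknown = [k for k in keys if k not in KEY_POOL]
    if unknown:
        raise KeyError(f"keys not in pool: {unknown}")
    return {k: KEY_POOL[k] for k in keys}


# One data record layout per annotation type; the review reuses the citation keys.
citation_layout = make_layout(["title", "author", "year"])
review_layout = make_layout(["reviewer", "decision", "title", "author", "year"])
```

Because both layouts draw on the same pool, the citation keys inside a review record keep the same name and data type as in a standalone citation record.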