 Simple Codebook View Team

...

Minutes of Simple Codebook group, Tuesday June 21, 2016

Attendees: Steve McEachern, Larry Hoyle, Oliver Hopt

Review of the output from Norway continued. The group focussed on how to resolve the outstanding fields identified in the DDI-C profile (from the Google spreadsheet here:

https://docs.google.com/spreadsheets/d/1VDbVz2KRRSX_KEf0IfuE-QqMyTDupftCZfBdBM6VPT8/edit#gid=1652443366).

The proposal was for the remaining content to be addressed through three mechanisms:
- Referral to AG/Modelling group for "general approach" matters (such as Citation)
- Specific issues for the team to resolve (or to be addressed by related teams including DataDescription)
- Content that needed to be deferred due to dependencies on future activities of current working groups (Methodology and Physical Layout).

Details of each are below.

Discussions for modelling and/or advisory group:
1. Optional vs. mandatory content
2. Citations: text citations (e.g. Bibliographic Citation) vs. constructed/compiled/generated citation (from constituent parts)
- (also need to account for required citation text from data producers)
- Dublin Core: BibCit is one of the DCMI terms (but not the core 15 terms)
3. Access conditions:
- Whole datasets (DDI-C, DDI-L profiles)
- Variables within datasets (DDI-C, DDI-L profiles)
- Units within datasets
- Cases (records?) within datasets
- Metadata (e.g. Census RDC content restricts information on variable metadata)
AND
- What content is required within the access conditions (there was a model mentioned that may be a candidate)
- Variable Security and Variable Embargo (from DDI2.5)
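For the modelling discussion on citations (item 2 above), the difference between a supplied text citation and one compiled from constituent parts can be sketched as follows. This is only an illustration: the class and field names are hypothetical and are not DDI or Dublin Core element names, and the citation layout is an arbitrary assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Citation:
    # Hypothetical constituent parts; not taken from any DDI profile.
    creator: str
    year: int
    title: str
    publisher: str
    # Citation text that a data producer requires to be used verbatim, if any.
    required_text: Optional[str] = None

    def compiled(self) -> str:
        """Build a citation string from the constituent parts."""
        return f"{self.creator} ({self.year}). {self.title}. {self.publisher}."

    def display(self) -> str:
        """Prefer the producer's required text; otherwise compile one."""
        return self.required_text if self.required_text else self.compiled()


example = Citation("Example Archive", 2016, "Example Survey", "Example Publisher")
print(example.display())
```

The sketch assumes that a producer-supplied text takes priority over a generated one; whether that is the right rule is one of the questions for the modelling group.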

SC team (or related teams) to resolve - Additional questions/fields outstanding:
1. Geographic Polygons

2. Variable metadata:
- VariableFiles: (files that contain this variable??) - probably covered by a DDI4 Relationship - recommend deferring this if needed (as may be part of future DataDescription model development)
- VariableInterval: continuous or discrete

3. ResponseUnit and AnalysisUnit (and Unit of Measurement)
- Consider a situation where the respondent to a survey is a Parent but the unit of interest is the Child, and the unit of analysis might be either the Child or the Household. How do we describe these different "units"?
- In particular, "AnalysisUnit" is problematic - because the unit of analysis is dependent on the research use - not on the data as captured.
- Might be related to Viewpoint??

Variable content characteristics may be best addressed now:

- VariableInterval,
- DDI 3.2 dimensions such as format, scale, decimalPositions, ... - numeric representation, classification level, ... (3.2 ties this more closely to the data type). Fundamentally these are attributes of the DATA TYPE and the MEASUREMENT
- Also consider the SummaryStatistics (DDI-C 2.5) in this discussion (note that this is probably more a characteristic of the "set of datums" rather than the InstanceVariable)
- Should be addressed by DataDescription to make a recommendation on when these attributes will be incorporated
- Larry Hoyle recommends including PRECISION as an attribute of the measurement
- What other content is commonly available from statistical packages? Reference: Hoyle and Wackerow paper in IASSIST Quarterly, V39 N.3-4
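To make the idea of characteristics attached to the data type and the measurement more concrete, a rough sketch follows. The names (Interval, MeasurementTraits, decimal_positions, precision) are hypothetical and do not come from DDI 3.2 or DDI4; the meaning given to precision (the resolution of the measurement) is an assumption, since the discussion did not define it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Interval(Enum):
    # Mirrors the "continuous or discrete" distinction noted for VariableInterval.
    CONTINUOUS = "continuous"
    DISCRETE = "discrete"


@dataclass
class MeasurementTraits:
    # Hypothetical grouping of attributes; not an actual DDI class.
    interval: Interval
    decimal_positions: int = 0
    # Assumed meaning: resolution of the measurement, in the measurement's units.
    precision: Optional[float] = None

    def format_value(self, value: float) -> str:
        """Render a value with the declared number of decimal positions."""
        return f"{value:.{self.decimal_positions}f}"


weight = MeasurementTraits(Interval.CONTINUOUS, decimal_positions=1, precision=0.1)
print(weight.format_value(72.456))
```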

Recommended for deferral
1. Methodology - all related fields
- All fields within Methodology section of DDI-C
- Also includes Imputation
This requires output of Methodology team.

2. PhysicalLayout
- MissingData
- VariableLocationStart-End-... (i.e. location in FixedWidthFiles)
This content requires the fixed width layout from the DataDescription group - which may not be included in the initial DD preview release.
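As an illustration of why the start and end location fields and the missing-data codes depend on a fixed-width layout, here is a minimal sketch of reading one variable from a fixed-width record. The function name, the sample record, and the convention that positions are 1-based and inclusive are all assumptions made for the example, not part of any DataDescription output.

```python
from typing import FrozenSet, Optional


def read_fixed_width(
    record: str,
    start: int,
    end: int,
    missing_codes: FrozenSet[str] = frozenset(),
) -> Optional[str]:
    """Extract one variable's value from a fixed-width record.

    start and end are assumed to be 1-based, inclusive column positions.
    Returns None when the value matches a declared missing-data code.
    """
    raw = record[start - 1:end].strip()
    if raw in missing_codes:
        return None
    return raw


# Hypothetical layout: id in columns 1-4, sex in column 5, age in columns 6-8.
record = "0001M 99"
person_id = read_fixed_width(record, 1, 4)
sex = read_fixed_width(record, 5, 5)
age = read_fixed_width(record, 6, 8, missing_codes=frozenset({"99"}))
print(person_id, sex, age)
```

Without the layout (start and end positions) the record cannot be split at all, and without the missing-data codes the value 99 would be read as a real age.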

 

...

Codebook meeting

2 February 2016

Attending: Dan, Michelle, Steve, Oliver, Jon, Larry, Jared

There’s some lack of clarity about where this group stands.  Discussed what to include in simple codebooks.  One idea is to review the spreadsheet of common elements (summary of CESSDA) and build on that.  Essentials seem to include: enough information to read the data into a statistical package, label values, understand the universe, understand what each measure means so you can interpret the data, and attribution information.  Another idea is to look at examples of simple codebooks, identify what they use, and then map to a model.

We need to be careful to keep things simple.  Even older versions of DDI 2 weren’t exactly simple.

If we nail down definitions, do we make instances of previous versions incompatible?  As we define which information elements we want in DDI 4.0, we can specify the corresponding element in DDI 2 for anyone going backwards.

Next steps:

  1. Michelle will go through spreadsheet and narrow down to those elements that are DDI Lite and any others that are heavily used (e.g., key words).

  2. Will paste those elements into new sheet within the spreadsheet.

...