Minutes of Simple Codebook group, Tuesday June 21, 2016

Attendees: Steve McEachern, Larry Hoyle, Oliver Hopt

Review of the output from Norway continued. The group focussed on how to resolve the outstanding fields identified in the DDI-C profile (from Larry's Google spreadsheet: https://docs.google.com/spreadsheets/d/1VDbVz2KRRSX_KEf0IfuE-QqMyTDupftCZfBdBM6VPT8/edit#gid=1652443366).

The proposal was for the remaining content to be addressed through three mechanisms:
- Referral to AG/Modelling group for "general approach" matters (such as Citation)
- Specific issues for the team to resolve (or to be addressed by related teams including DataDescription)
- Content that needs to be deferred due to dependencies on future activities of current working groups (Methodology and Physical Layout)

Details of each are below.

Discussions for modelling and/or advisory group:
1. Optional vs. mandatory content
2. Citations: text citations (e.g. Bibliographic Citation) vs. constructed/compiled/generated citations (from constituent parts)
   - Also need to account for required citation text from data producers
   - Dublin Core: Bibliographic Citation is one of the DCMI terms (but not one of the core 15 terms)
3. Access conditions:
   - Whole datasets (DDI-C, DDI-L profiles)
   - Variables within datasets (DDI-C, DDI-L profiles)
   - Units within datasets
   - Cases (records?) within datasets
   - Metadata (e.g. Census RDC content restricts information on variable metadata)
   AND
   - What content is required within the access conditions (there was a model mentioned that may be a candidate)
   - Variable Security and Variable Embargo (from DDI2.5)

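As a rough illustration of the citation question in item 2 above (a stored text citation vs. one compiled from constituent parts, with producer-required wording taking precedence), the following is a minimal sketch only. The class, its field names, and the output format are hypothetical and are not taken from DDI-C, DDI-L, or the DDI4 model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Citation:
    # All field names are hypothetical, chosen for illustration only.
    title: str
    creators: List[str]
    year: int
    publisher: str
    # Citation wording mandated by the data producer, if any.
    required_text: Optional[str] = None

    def render(self) -> str:
        """Return the producer-required text if present,
        otherwise compile a citation from the constituent parts."""
        if self.required_text:
            return self.required_text
        authors = "; ".join(self.creators)
        return f"{authors} ({self.year}). {self.title}. {self.publisher}."


# Placeholder values, not a real study.
example = Citation(
    title="Example Survey",
    creators=["Example, A.", "Sample, B."],
    year=2016,
    publisher="Example Archive",
)
print(example.render())
```

A design like this would let a profile hold either form, but whether the text citation or the constituent parts should be mandatory is exactly the question being referred to the modelling/advisory group.
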
SC team (or related teams) to resolve - Additional questions/fields outstanding:
1. Geographic Polygons
2. Variable metadata:
   - VariableFiles: (files that contain this variable??) - probably covered by a DDI4 Relationship - recommend deferring this if needed (as may be part of future DataDescription model development)
   - VariableInterval: continuous or discrete
3. ResponseUnit and AnalysisUnit (and Unit of Measurement)
   - Consider a situation where the respondent to a survey is a Parent but the unit of interest is the Child - and the unit of analysis might be either the Child or the Household?? How do we describe these different "units"?
   - In particular, "AnalysisUnit" is problematic, because the unit of analysis is dependent on the research use, not on the data as captured.
   - Might be related to Viewpoint??

Variable content characteristics may be best addressed now:
- VariableInterval
- 3.2 dimensions such as format, scale, decimalPositions, ...
- numeric representation, classification level, ... (3.2 ties this more closely to the data type). Fundamentally these are attributes of the DATA TYPE and the MEASUREMENT
- Also consider the SummaryStatistics (DDI-C 2.5) in this discussion (note that this is probably more a characteristic of the "set of datums" rather than the InstanceVariable)
- Should be addressed by DataDescription to make a recommendation on when these attributes will be incorporated
- Larry Hoyle recommends including PRECISION as an attribute of the measurement
- What other content is commonly available from statistical packages? Reference Hoyle and Wackerow paper in IASSIST Quarterly, V39 N.3-4

Recommended for deferral
1. Methodology - all related fields
   - All fields within Methodology section of DDI-C
   - Also includes Imputation
   This requires output of Methodology team
2. PhysicalLayout
   - MissingData
   - VariableLocationStart-End-... (i.e. location in FixedWidthFiles)
   This content requires the fixed width layout from the DataDescription group, which may not be included in the initial DD preview release.