Data Collection and Processing

Formerly known as Data Production; now split into Data Collection and Processing (Proposed View) and Data Analysis and Dissemination (Suggested View).

 

Purpose: To describe the activities and outputs derived from the implementation of methodology.
Description of view

This view describes the various activities that constitute the implementation of data collection and data processing according to the specification set out in the methodology/study design.

 

 

View status: PROPOSED
Proposed by

Moving Forward Project

The activities included in this view are those required to deliver a final data product that is suitable for analysis. Activities involved in analysis and dissemination (GSBPM steps 6 & 7) are out of scope; they are the subject of a separate suggested view.

This includes (in general terms) the following stages in the GSBPM

 

Topics covered

Likely topics for this view include:

Collection:

    • Sample selection
    • Sample management
    • Data collection process
    • Paradata
    • Field work management

Processing:

    • Weighting process
    • Imputation process
    • Data cleaning process
    • Coding process
    • (etc.) 

A more detailed overview of the expected content of this view is given in the GLBPM Longitudinal Data project user story developed by Barry Radler and Jon Johnson (see Sources below).

 

Questions/Issues

This view will instantiate the methodology specified in the Methodology view, particularly the study design. It will therefore need to parallel the procedures outlined in that Methodology.

Similarly, there will be other views that this view draws on heavily but that are separate from it. In particular:

  • Sample
  • (Simple) Instrument
  • (Simple) Data Description
  • Process

Steve, Dan, Jay, and Barry discussed a DDI 'cookbook' that would describe how to use DDI (which fields/elements) to document specific data collection and processing situations (simple survey, cognitive assessment, tissue collection, constructed or transformed variables, etc.). This would address the problem of multiple approaches to ostensibly similar data production processes, which results in non-interoperability among instances.

 

 

 

Sources:

GLBPM Longitudinal user story: https://docs.google.com/spreadsheet/ccc?key=0AmjuyyBzwvcodE9Qam5HbFhJRTV6U19rbnY5WUVZZEE&usp=drive_web#gid=3