Simple Codebook Meeting
February 16, 2015

Present: Dan Gillman, Larry Hoyle, Steve McEachern, Mary Vardigan

Completeness of crosswalk between 2 and 3

The crosswalk or mapping is essentially one-way from 2 to 3. Codebook doesn't have the reusability that Lifecycle does. This is the same issue as between SPSS and Stata/SAS. We should look at the mapping in more detail.

Content and functionality of Simple Codebook

We want to make sure that Simple Codebook lets us write or ingest 2.x fairly seamlessly. Are the same kinds of element names available in 3? The names change even at the highest level.

Many miss the Tag Library as it was so simple. This kind of resource would be useful along with a mapping. However, Wendy advises that we don't have to worry about 2 since the mapping is there.

Even 2 has a lot of content. Are we still talking about a simple codebook as opposed to a complex codebook? Simple should allow you to take information from a major statistical package and move to another without losing any information (this is our definition of simple). Questions should be included, as should sampling and universe. We should review DDI Lite and DDI Core, which have not been updated to the most recent versions of Codebook and Lifecycle. This may enable us to have a framework for content. We will deal with functionality later.

We have been making the assumption that we have the Instrument information and the Data Description information from those two views. What else do we need? We need context or study-level information: universe, sampling, design, bibliographic information. In DDI 2.* we have Citation, Study information (which is discovery related), Methodology, and Access. This is good content.

What do you need to know to use the data? You need the variable information. Question order and the way questions are asked may be important.

There is a tension between being very simple and following best practice for good documentation. Can we add pointers to relevant information? The simple/complex distinction is levels of detail.

For secondary users, we need enough information for a researcher to be able to understand and evaluate the quality of a dataset without reference back to the original data producer. We also need enough information to pull the data into a statistical package.

We started an exercise to take the common set of CESSDA, ICPSR, and IHSN mandatory schemas and figure out what the superset is. We will find this spreadsheet and compare this set of elements to what is in DDI Lite and DDI Core.
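A minimal sketch of that comparison, assuming each schema's mandatory elements can be exported from the spreadsheet as a set of names. Every element name below is a placeholder for illustration, not the actual CESSDA, ICPSR, or IHSN mandatory list, and `profile` is a hypothetical stand-in for the elements in DDI Lite or DDI Core.

```python
# Placeholder element names only; the real mandatory lists are in the spreadsheet.
mandatory = {
    "CESSDA": {"title", "universe", "variable_label"},
    "ICPSR": {"title", "variable_label", "sampling"},
    "IHSN": {"title", "universe", "weights"},
}

# Hypothetical stand-in for the elements available in DDI Lite or DDI Core.
profile = {"title", "variable_label", "universe"}

# Superset: every element that is mandatory in at least one schema.
superset = set().union(*mandatory.values())

# Common set: elements that are mandatory in all of the schemas.
common = set.intersection(*mandatory.values())

# Elements some schema requires that the profile does not cover.
missing_from_profile = sorted(superset - profile)

print("superset:", sorted(superset))
print("common:", sorted(common))
print("missing from profile:", missing_from_profile)
```

The same three set operations would apply unchanged once the real element lists are substituted.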

Necessary for a simple codebook: variables, questions, and layout; universe or population; level of geography (basically coverage, including temporal and subject); sampling or weights (and a pointer to a thorough description of sampling).

The distinction between simple and complex for data description is between a simple rectangular file and other data types; this applies to codebook in some ways as well. But there may be a cascading effect if we limit ourselves to simple rectangular files (we should describe hierarchical files as well, like CPS). If we are describing the files themselves, you can describe qualitative files as objects with the existing DDI. You can have hierarchical data in CSV with a record type field, but historically we have had files with physical representations of the data that are esoteric. How much of this do we need to handle? For a simple codebook, the simple representation should be limited to Unicode or something like that.
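To illustrate the point about hierarchical data in CSV, here is a minimal sketch of reading a file in which the first field is a record type. The sample rows, the record type codes ("H" for household, "P" for person), and the column names are all hypothetical; they are not taken from CPS or from any DDI schema.

```python
import csv
import io

# Hypothetical sample: "H" rows describe households, "P" rows describe persons.
SAMPLE = """H,1001,CA
P,1001,1,34
P,1001,2,36
H,1002,NY
P,1002,1,52
"""

# Hypothetical column layout for each record type (first field is the type).
LAYOUTS = {
    "H": ["rectype", "hh_id", "state"],
    "P": ["rectype", "hh_id", "person_no", "age"],
}


def split_by_record_type(text):
    """Split a mixed-record CSV into one rectangular table per record type."""
    tables = {rectype: [] for rectype in LAYOUTS}
    for row in csv.reader(io.StringIO(text)):
        if not row:
            continue  # skip blank lines
        names = LAYOUTS[row[0]]  # raises KeyError on an unknown record type
        tables[row[0]].append(dict(zip(names, row)))
    return tables


tables = split_by_record_type(SAMPLE)
print(len(tables["H"]), "household rows;", len(tables["P"]), "person rows")
```

Each resulting table is rectangular, so a codebook limited to rectangular files could still describe this kind of hierarchy as one layout per record type plus the linking identifier.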

Homework: review DDI Core: http://www.ddialliance.org/sites/default/files/ddi3/DDI3_CR3_Core.xml and DDI Lite: http://www.ddialliance.org/sites/default/files/ddi-lite.html

Also think about what limitations we want to put on format to keep the idea of a simple codebook while keeping it rich enough to cover enough situations.

The next meeting will be in two weeks on March 2.