Simple Codebook
June 8, 2015

Present: Dan Gillman, Larry Hoyle, Jenny Linnerud, Oliver Hopt, Mary Vardigan

The group continued to review the spreadsheet mapping DDI 2.* to DDI4 and to note items that the modeling work should take up.

The group then turned to the metadata that the statistical packages include. Larry provided a spreadsheet that he and Achim had developed to show which metadata are included in each of the major statistical packages. It will be important for Codebook to contain all of this metadata. There are other ways of handling data, such as SQL, that might also be appropriate. In the Big Data world, Python is becoming popular. Python is a general-purpose scripting language that has taken over the role Perl once had. It can explicitly represent tree structures such as JSON and XML, so it is very flexible. People have developed modules for doing statistical work in Python.
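To illustrate the point about tree structures, here is a minimal Python sketch using only the standard library. The variable name, labels, and element names are hypothetical and are not taken from DDI or from Larry's spreadsheet; the XML layout is just one possible mapping.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical variable-level metadata of the kind a statistical package stores.
variable = {
    "name": "AGE",
    "label": "Age of respondent",
    "missing_values": [98, 99],
    "value_labels": [
        {"code": 98, "label": "Refused"},
        {"code": 99, "label": "Don't know"},
    ],
}

# Nested dicts and lists map directly onto a JSON tree.
as_json = json.dumps(variable, indent=2)


def to_xml(tag, value):
    """Build an XML element from nested dicts, lists, and scalar values."""
    element = ET.Element(tag)
    if isinstance(value, dict):
        for key, child in value.items():
            element.append(to_xml(key, child))
    elif isinstance(value, list):
        for child in value:
            # List members get a generic tag (an assumption of this sketch).
            element.append(to_xml("item", child))
    else:
        element.text = str(value)
    return element


# The same tree serialized as XML text.
as_xml = ET.tostring(to_xml("variable", variable), encoding="unicode")
```

The same in-memory structure feeds both serializations, which is the flexibility the group noted.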

Looking at all the software metadata from the statistical point of view is important. We need to make sure that everything in Larry's spreadsheet is accounted for in a meaningful way. We need to identify things that are not in the DDI 2.* spreadsheet. We can go through this all together or do assignments.

The number of significant digits is important in some scientific data, and whether a number has been rounded can also matter. This should be included in DDI4. The ISO/IEC 11179 community has discussed accuracy and precision, which are related to significant digits. The Data Description Team should address this. For an Instance Variable we may want to record significant digits, while for a Represented Variable we would talk about accuracy.
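As a rough illustration of what recording significant digits and a rounding indicator could involve, here is a minimal Python sketch. The function names are hypothetical, and nothing here reflects an agreed DDI4 design. It assumes values arrive as text so that trailing zeros are preserved.

```python
from decimal import Decimal, ROUND_HALF_EVEN


def significant_digits(text):
    """Count the significant digits of a number as it was written.

    Leading zeros are ignored. Trailing zeros are counted, so "1200"
    is treated as having four, which is ambiguous in practice.
    """
    return len(Decimal(text).as_tuple().digits)


def round_significant(text, n):
    """Round to n significant digits; also report whether the value changed."""
    value = Decimal(text)
    if value.is_zero():
        return value, False
    # adjusted() is the exponent of the most significant digit.
    quantum = Decimal(1).scaleb(value.adjusted() - n + 1)
    rounded = value.quantize(quantum, rounding=ROUND_HALF_EVEN)
    return rounded, rounded != value


digits = significant_digits("0.00120")
rounded, was_rounded = round_significant("123.456", 4)
```

The boolean returned alongside the rounded value is the kind of "has this been rounded" indicator discussed above.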

Larry and Dan will talk with the Data Description and Modeling teams about these issues.

Simple Codebook Meeting
June 22, 2015

Present: Dan Gillman, Mary Vardigan