Codebook Working Page

DDI 4 xml example - ICPSR Study 8344

Documenting DDI 4 Codebook View like ODM

-ODM Example

World Bank DDI 4 xml example

Documentation Recommendations

Clear guidance on where to start - DDI element, root

Clearly define what abstract classes mean

Define what attributes, properties, and relationships mean in the context of the model. This should be spelled out for the layperson with specific examples.

Annotation will also require guidance on when and where to use it.

Specific guidance on creating IDs, by hand.

Clarification of what CollectionTypes mean (bags and sets).
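As a rough illustration of the bag/set distinction, here is a sketch of general collection semantics (not taken from the DDI 4 model itself): a set holds each member at most once, while a bag (multiset) may hold the same member several times. The member names below are made up.

```python
from collections import Counter

# Hypothetical member identifiers; "var-age" is deliberately listed twice.
members = ["var-age", "var-sex", "var-age"]

as_set = set(members)      # a set keeps each distinct member once
as_bag = Counter(members)  # a bag (multiset) keeps a count per member

print(sorted(as_set))        # ['var-age', 'var-sex']
print(as_bag["var-age"])     # 2
print(sum(as_bag.values()))  # 3 (duplicates are counted)
```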

Clear explanation of how to record what used to be part of the methodology section in DDI 2, such as sampling procedures, weighting, etc. The added flexibility of doing this in 4 means it needs more guidance.

The relationships between instance, represented, and conceptual, for both variables and questions, need to be made VERY clear.

Also, clear language around DataStore and the PhysicalLayout options.

Locators are buried quite deep. Clear language needed.

Monday Notes:

General:

Has a map in mind from 2 and 3. 

"This is even more complicated than 3.0"

The issues with cross-referencing and complication. See Thursday notes.

Specific questions: Where to start? What is the root element? Need a wrapper.

What is the attribute type of DDI element - look at attributes (manual process different)

Chose DDI, which gives you documentinformation - fixed

Clear instruction on where to use Annotation and what to put in it (documentinformation/study, etc.). Describing the instance vs the study.

STUDY:

Under hasAnnotation:

Rights or Copyright? some more clarification

Content vs LanguageSpecificString (Managing Agency and Agent Name) Is this consistent? - See Thursday notes

Date: Did we do this right?

Where to put language? xs:language? - See Thursday notes

RecordCreationDate and Revision? Use in document description or study level?

FundingInformation:

Clarified what was abstract and what wasn't.

type of class - look at the attributes

Funding Agent or Organization?

Agents? Members?

hasAFunder indicates that should be an agent, could not insert agent, type can only be organization or individual or machine

Clarifying how relationships work and the URN

StudySeries

type - collectionType?

hasDesign type of class attribute?

Design linked via hasDesign, but DesignOverview on the view list

hasProcess - what happened when? Use of process steps?

Tuesday Notes:

Clarification of language around object properties (sub-elements) and attributes

Clarification of language around coverage and population....maybe, don't know yet.

Coverage:

Clarification of Spatial Object (geometric) and spatialAreaCode (controlled vocabulary) - in DDI2 we have smallest geographical unit, as humans understand them. In the definition it mentions street (county, state, block, street, etc.) In 4 this is described in geometric language. Is the spatialAreaCode supposed to replace that? Or are we losing that human-readable conceptual information? - Discussed. They will add the ability to create the human-readable info.

SpatialObject value question - the polygon value was not validated by the schema. It probably needs to be a value from an enumeration, and that enumeration may not be incorporated in the schema. - Use Polygon, not POLYGON.
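The Polygon/POLYGON point comes down to enumeration values being matched exactly, including case, as XML Schema enumerations are. A minimal sketch of that check follows; only "Polygon" comes from these notes, and the other allowed values and the function name are placeholders.

```python
# Only "Polygon" comes from the notes; "Point" and "Line" are placeholders.
ALLOWED_SPATIAL_OBJECTS = {"Point", "Polygon", "Line"}


def is_valid_spatial_object(value: str) -> bool:
    """Exact, case-sensitive match against the allowed values."""
    return value in ALLOWED_SPATIAL_OBJECTS


print(is_valid_spatial_object("Polygon"))  # True
print(is_valid_spatial_object("POLYGON"))  # False
```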

coverageDate links to referenceDate, which inherits typeOfDate from AnnotationDate? General question about why there are different classes if they have the same type (Date, annotationDate, ReferenceDate).

SamplingProcedure:

Please tell us more about Name

The relationships with design, samplingprocess, are all new to 4.

What is component methodology? Why do we need it? See Thursday notes.

SamplingAlgorithm and Algorithm (from Study), what is their purpose?

MethodologyOverview:

It's a section in 2, no content - discussed this as a feature in 4

For a user, what is the difference between methodologyOverview and Methodology? They seem to have different relationships, but Methodology does not appear as a field in the view. - Methodology is an abstract class.

Type of class for componentMethodology not specified in the schema - This is on purpose to make it as flexible as possible.

Variables:

Duplicates of everything in Represented and Instance variables.

What is the difference between a conceptual and a represented variable? If Instance V is linked to Represented V, then it is not linked to Conceptual V. That implies that Instance V can be linked to a conceptual variable. - discussed, but can't find the notes...

Afternoon post-explanation:

This is way too complicated. -K

CodeItem value allows for code and datum, no designation or value

Wednesday Notes:

Continuing Variable:

Question: IDs in 4 are plain text. Does this make it more difficult for software to parse? E.g. Codelist contains code, code denotes category. Class name has same type of content.

Code and codeItem: why is this necessary when a code item can only contain one code? Redundant? Code without codeItem? - discussed. See Thursday notes.

Time to discuss IDs! The difficulty of programming to all of the cross-references. External references would not be as complicated, would be to another object and then parse that object. - discussed. See Thursday notes.
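To make the cross-referencing concern concrete, here is a minimal sketch of what software has to do with plain-text ID references: index every identified element first, then look each reference up. The element and attribute names in the sample document are placeholders, not DDI 4 names.

```python
import xml.etree.ElementTree as ET

# Made-up document: element and attribute names are placeholders, not DDI 4.
DOC = """
<Root>
  <Category id="cat-1"><Label>Yes</Label></Category>
  <Code id="code-1" denotes="cat-1"><Value>1</Value></Code>
</Root>
"""

root = ET.fromstring(DOC)

# First pass: index every identified element by its id.
by_id = {el.get("id"): el for el in root.iter() if el.get("id") is not None}

# Second pass: follow the plain-text reference from the code to its category.
code = by_id["code-1"]
category = by_id[code.get("denotes")]
print(category.findtext("Label"))  # Yes
```

Every reference needs a lookup like this, which is why many internal cross-references add programming effort.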

How to group variables? - on the task list

Range and valid range - do it in the ValueandConceptDescription (minimum value inclusive, exclusive, etc.)
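A small sketch of what inclusive versus exclusive bounds mean for a valid range. The class and field names here are hypothetical and are not the DDI 4 property names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ValueRange:
    """Hypothetical valid-range description with inclusive/exclusive bounds."""
    minimum: Optional[float] = None
    maximum: Optional[float] = None
    min_inclusive: bool = True
    max_inclusive: bool = True

    def contains(self, value: float) -> bool:
        # A missing bound means the range is open-ended on that side.
        if self.minimum is not None:
            if value < self.minimum:
                return False
            if value == self.minimum and not self.min_inclusive:
                return False
        if self.maximum is not None:
            if value > self.maximum:
                return False
            if value == self.maximum and not self.max_inclusive:
                return False
        return True


# Example: minimum 18 inclusive, maximum 99 exclusive.
age = ValueRange(minimum=18, maximum=99, min_inclusive=True, max_inclusive=False)
print(age.contains(18))  # True
print(age.contains(99))  # False
```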

Frequency

How to input frequencies? - not available - on task list.

Question

InstanceQuestion requires RepresentedQuestion, which is not in the codebook view. - fixed

Able to create a represented question when using the whole DDI schema - see task list

Layout

We cannot instantiate a physical layout in the codebook view. - Need to use a specific type, so rectangularLayout or event date

No location attributes in DDI 4 - locators have been added as valuemappings in rectangularLayout

Thursday Notes:

Mehmood Session:

He started with Larry's data description of a csv example. He will take study description information from Sanda's work.

Goal is to have the same elements that the World Bank uses.

Worked in DataStore for the record counts, missing values, version rationale, etc.

He couldn't find where to put processing checks.

The name of the data file should be in dataStore (logical), but there is no equivalent place for the physical file. IMPORTANT

The World Bank thinks of variables as part of a data file, not as existing outside the file. In 4 he defined them as instance variables (with no link), then created a logicalRecordLayout that links the variables to the data file.

Rectangular layout contains ValueMappings, helpful for the csv that they are generating.

Summary statistics are missing. The World Bank uses these a lot; they were lost when PhysicalInstance went away.

Will go home and finish and leave notes in the xml, and fill in from Sanda's example.

More work on filling in all of study description fields in 4 should be done.

Mehmood to send a spreadsheet of the World Bank 'profile' from Nesstar to Sanda for further testing.

Sanda Session:

Methodology - Use subjectofMethodology to describe different pieces, weighting, sampling, etc. Problem is how do you control who uses what controlled vocabulary. The advantage of 2 is that these fields are set and clear. Could you hard-wire the external controlled vocabulary into the software tool?

File issue - clean up use of content vs. languagespecificstring - need to identify all points where attributes point to complexdatatypes

Being able to lock it down, constraining the extensibility somehow? Using the World Bank as a use case.

xs:language should be a different datatype in annotation; it should be a complex type that can be xs:language and string.

ID talk - Cross-referencing in 3 was tree-referenced in XML; with the UML, generating code from references will be easier.

How do you decide what winds up in Study and what doesn't, such as Funding? Basically, anything that needs to be reused can still exist on its own.

In the data description csv example that Larry created there are no variables, only the record layout.

Friday Notes:

Sanda's notes from her work Friday morning.

-Used the new updated Build.

-The issue with the reference to Conceptual Instrument only (when we needed to refer to the Implemented Instrument) has been fixed.
We can now both refer to, and describe an Implemented Instrument.
The conceptual instrument is still present as a class in the Codebook view. I am not sure if this was intended, but have no problem with it.

-I was able to input codes! I did have a question about these, however.
In the documentation, Token (the actual element for codes, it appears) is listed as a property of CodeValue.
I did not, however, have the possibility to enter a CodeValue element in the DDI (it is not included in the schema (?))
Instead, the markup runs like this: <Code><Representation><Token><Content>HERE I ENTERED THE CODE FROM THE DATA.
You might want to double check with Wendy on this, or let me know if I should do it.
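For reference, the nesting described above (Code, Representation, Token, Content) can be reproduced with a short sketch. It assumes no namespaces or attributes, which a real DDI 4 instance would likely need, and the function name and the code value "1" are made up.

```python
import xml.etree.ElementTree as ET


def build_code(code_value: str) -> ET.Element:
    """Nest Content inside Token inside Representation inside Code."""
    code = ET.Element("Code")
    representation = ET.SubElement(code, "Representation")
    token = ET.SubElement(representation, "Token")
    content = ET.SubElement(token, "Content")
    content.text = code_value  # the code as it appears in the data
    return code


element = build_code("1")
print(ET.tostring(element, encoding="unicode"))
# <Code><Representation><Token><Content>1</Content></Token></Representation></Code>
```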

-I looked at File Description (i.e. Data Store). It lacks a lot of the DDI-C elements: case quantity, variable quantity, logical record length, records per case...
Not sure what's going on. Is this information elsewhere in the DDI 4? Also, data checks are not enabled at this level. But perhaps they should be described elsewhere (?)

-This is minor, but fixing it may be of great help for implementers (?): sometimes the LanguageSpecificString is typed out by Oxygen when opening the element where it is needed, and sometimes not (e.g. Access Conditions).
It would be nice if this happened consistently so that the user does not have to figure it out by him/herself. Also, could this be done with Content too? I don't know, but it would be helpful.

-I looked at Access. It is very similar to what we have in DDI-C, therefore no issues.

General Questions

Code is redundant when you already have a codeItem.

coverageDate links to referenceDate, which inherits typeOfDate from AnnotationDate? General question about why there are different classes if they have the same type (Date, annotationDate, ReferenceDate).

SamplingAlgorithm and Algorithm (from Study), how to use?

Could you hard-wire the external controlled vocabulary into a software tool? In general thinking about a way to control the extensibility and using the World Bank as a use case.



Specific Issues

General

  • File issue - clean up use of content vs. languagespecificstring - need to identify all points where attributes point to complexdatatypes DMT-132
  • xs:language should be different datatype in annotation, should be complex thing that can be xs:language and string. DMT-133

  • ConceptualInstrument is not in Codebook View. See item in Study (below) (ADDED)

Study

  • Design linked via hasDesign, but DesignOverview on the view list CWG-9
  • hasAFunder indicates that should be an agent, could not insert agent, type can only be organization or individual or machine ADDED documentation
  • hasInstrument only allows for ConceptualInstrument in the attributes, not ImplementedInstrument. ADDED ConceptualInstrument to FV; link is from ImplementedInstrument to ConceptualCodebook

Variables

  • Need value for code DMT-128
  • How to group variables? DMT-127
  • Range and valid range FOUND

Frequency

  • How to input frequencies? DDI4DATA-18

Question

  • InstanceQuestion requires RepresentedQuestion, which is not in the codebook view. DMT-126
  • Able to create a represented question when using the whole DDI schema DMT-126

Layout

  • We cannot instantiate a physical layout in the codebook view. FOUND
  • No location attributes in DDI 4 DMT-128
  • The name of the data file should be in dataStore (logical), but there is no equivalent place for the physical file. IMPORTANT DDI4DATA-19

Coverage

  • in SpatialObject, we need a place to put human-readable lowest level of coverage. e.g. street, county, state, etc. DMT-134

Attached files

  • ODM1-3-0-Final.htm (HTML file), modified Jun 01, 2017 by Kelly A Chatain
  • update 3 retirement study.xml (XML file), modified Jun 05, 2017 by Kelly A Chatain
  • Documentation recommendations for DDI4.txt (text file), modified Jun 05, 2017 by Kelly A Chatain
  • ddi4-IHSN.xml (XML file), modified Jun 05, 2017 by Kelly A Chatain