Controlled Vocabularies

Notes on Controlled Vocabularies

2014-10-23 Joachim Wackerow (meeting with Knut Wenzig, Justin Lynch, Sanda Ionescu)

Current Approach of DDI Controlled Vocabularies

  • Value of the Code
  • Descriptive Term of the Code
  • Definition of the Code

Hierarchy is expressed by a separator (“.”) in the value of the code. The leftmost segment is the highest level.
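
The separator convention can be sketched in a few lines of Python. This is only an illustration, not part of the DDI specification: the function names and the example value `Interview.FaceToFace.CAPI` are hypothetical.

```python
from typing import List, Optional


def code_levels(code: str, separator: str = ".") -> List[str]:
    """Split a hierarchical code into its levels, highest level first."""
    return code.split(separator)


def parent_code(code: str, separator: str = ".") -> Optional[str]:
    """Return the code one level up, or None for a top-level code."""
    head, found, _ = code.rpartition(separator)
    return head if found else None


# Hypothetical code value, used only for illustration.
example = "Interview.FaceToFace.CAPI"
levels = code_levels(example)   # leftmost entry is the highest level
parent = parent_code(example)   # the code with the last level removed
```

With this convention the hierarchy needs no separate structure: the parent of any code can be derived from its value alone.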

Both the term and the definition can be multilingual.

The American English expression is the canonical version.
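
A minimal sketch of one vocabulary entry under this approach, assuming language tags as dictionary keys and `en-US` as the tag of the canonical version. The class name, the fallback to the canonical text when a translation is missing, and the sample entry are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict

# Assumption: the canonical American English version is tagged "en-US".
CANONICAL_LANGUAGE = "en-US"


@dataclass
class VocabularyEntry:
    """One code: its value plus multilingual term and definition."""
    value: str
    terms: Dict[str, str] = field(default_factory=dict)
    definitions: Dict[str, str] = field(default_factory=dict)

    def term(self, language: str = CANONICAL_LANGUAGE) -> str:
        # Assumed behaviour: fall back to the canonical term if no translation exists.
        return self.terms.get(language, self.terms[CANONICAL_LANGUAGE])

    def definition(self, language: str = CANONICAL_LANGUAGE) -> str:
        # Same assumed fallback for the definition.
        return self.definitions.get(language, self.definitions[CANONICAL_LANGUAGE])


# Hypothetical entry, not taken from a published vocabulary.
entry = VocabularyEntry(
    value="Interview.FaceToFace",
    terms={"en-US": "Face-to-face interview", "de": "Persönliches Interview"},
    definitions={"en-US": "Interviewer and respondent meet in person."},
)
```

Here `entry.term("de")` returns the German term, while `entry.definition("de")` falls back to the canonical English definition because no German one is recorded.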

Current CVs in DDI 3

  • External controlled vocabularies
  • CodeList/CategoryScheme with nested codes for expressing hierarchy
  • Enumerated lists in XML Schema

Multiple Purposes

  • Code lists
  • Classification
  • Controlled vocabularies

Requirements

  • General
    • Ideally only one type of structure for multiple purposes
    • Usage of existing structures preferred if possible (like SKOS in the Semantic Web)
    • Same structure for the internal representation in the model/specification and for the external representation. Reasoning: easier processing and one software solution for both; only the reference differs.
    • Validation of structure, keys, and possibly values
      • What should be validated: keys, values, relationship, dependency, requirement, …
      • XML Schema: what should be validated by the XML parser, and what in a secondary-level validation
  • Simple approach
    • Hierarchy
    • Multi-lingual text for term and definition
    • As simple as possible, easy to process
  • Complex
    • Requirement of items
    • Relationship of items
      • Use case: thesaurus
      • CV for keys, CV for values of key/value pair
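
The key/value requirement above could be checked in a secondary-level validation roughly as follows. This is a sketch under assumptions: the keys, the allowed values, and the function name are hypothetical and only show one CV constraining the keys and one CV per key constraining the values.

```python
from typing import Dict, List, Set

# Hypothetical vocabularies: the dictionary keys form the CV for keys,
# and each key maps to the CV of values allowed for it.
VALUE_VOCABULARIES: Dict[str, Set[str]] = {
    "ModeOfCollection": {"Interview", "Interview.FaceToFace", "SelfAdministered"},
    "TimeMethod": {"CrossSection", "Longitudinal"},
}


def validate_pairs(pairs: Dict[str, str]) -> List[str]:
    """Return a list of problems; an empty list means all pairs are valid."""
    errors = []
    for key, value in pairs.items():
        allowed = VALUE_VOCABULARIES.get(key)
        if allowed is None:
            errors.append(f"unknown key: {key}")
        elif value not in allowed:
            errors.append(f"value {value!r} not allowed for key {key!r}")
    return errors
```

A check of this kind goes beyond what an XML parser validates against an enumerated list, because the allowed values depend on the key.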

Conclusion

Current sense: two different approaches are required: a simple CV structure, similar to the approach of the current DDI Controlled Vocabularies, and a more complex approach for additional requirements such as validation and defining requirements.

Ideally the complex approach could make use of the simple approach.
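
One way to picture the complex approach reusing the simple one is composition: the complex structure holds a simple vocabulary and adds requirements and relationships on top. The sketch below is an assumption about how that might look, not a proposed design. All class and field names are hypothetical, and "required" is read here as "the code must be used".

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class SimpleVocabulary:
    """Simple approach: code value -> descriptive term (one language for brevity)."""
    terms: Dict[str, str] = field(default_factory=dict)


@dataclass
class ComplexVocabulary:
    """Complex approach layered on a SimpleVocabulary instead of replacing it."""
    base: SimpleVocabulary
    required: Set[str] = field(default_factory=set)                  # codes that must be used
    related: List[Tuple[str, str]] = field(default_factory=list)     # thesaurus-like links

    def check(self, used_codes: Set[str]) -> List[str]:
        """Report codes unknown to the base vocabulary and required codes not used."""
        known = set(self.base.terms)
        problems = [f"unknown code: {c}" for c in sorted(used_codes - known)]
        problems += [f"missing required code: {c}" for c in sorted(self.required - used_codes)]
        return problems

    def related_to(self, code: str) -> List[str]:
        """Return codes linked to the given code, in either direction."""
        return [b if a == code else a for a, b in self.related if code in (a, b)]


# Hypothetical content, for illustration only.
simple = SimpleVocabulary(terms={"A": "Term A", "A.B": "Term A.B", "C": "Term C"})
complex_cv = ComplexVocabulary(base=simple, required={"A"}, related=[("A.B", "C")])
```

Software that only needs the simple structure can read `complex_cv.base` and ignore the rest, which matches the goal of one processing path for both cases.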

Attachment: Controlled Vocabularies for DDI 4.pptx (Microsoft PowerPoint presentation, modified Oct 28, 2014 by Wiki Editor)