Controlled Vocabularies
Notes on Controlled Vocabularies
2014-10-23 Joachim Wackerow (meeting with Knut Wenzig, Justin Lynch, Sanda Ionescu)
Current Approach of DDI Controlled Vocabularies
Value of the Code
Descriptive Term of the Code
Definition of the Code
Hierarchy is expressed by a separator (“.”) in the code. The leftmost segment is the highest level.
Both the term and the definition can be multi-lingual.
The American English expression is the canonical version.
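The structure above can be sketched in Python roughly as follows. This is only an illustration under assumptions: the class name CVEntry, the language tag "en-US" for the canonical version, and the sample code "Interview.FaceToFace" are hypothetical choices made for this sketch, not definitions taken from the DDI specification.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

CANONICAL_LANG = "en-US"  # assumed tag for the canonical American English text
SEPARATOR = "."           # hierarchy separator inside the code value


@dataclass
class CVEntry:
    """One code of a controlled vocabulary (hypothetical structure)."""
    value: str                                   # value of the code
    term: Dict[str, str]                         # language tag -> descriptive term
    definition: Dict[str, str] = field(default_factory=dict)

    def levels(self) -> List[str]:
        """Split the code into its hierarchy levels, highest level first."""
        return self.value.split(SEPARATOR)

    def parent_value(self) -> Optional[str]:
        """Return the code of the parent level, or None for a top-level code."""
        head, sep, _ = self.value.rpartition(SEPARATOR)
        return head if sep else None

    def term_in(self, lang: str) -> str:
        """Return the term in the requested language, else the canonical one."""
        return self.term.get(lang, self.term[CANONICAL_LANG])


# Illustrative entry; the code and the terms are made up for this sketch.
entry = CVEntry(
    value="Interview.FaceToFace",
    term={"en-US": "Face-to-face interview", "de": "Persönliches Interview"},
)
print(entry.levels())        # ['Interview', 'FaceToFace']
print(entry.parent_value())  # Interview
print(entry.term_in("fr"))   # Face-to-face interview (falls back to canonical)
```

Keeping the hierarchy inside the code string means no separate parent reference is needed; the parent is derived by cutting the code at the last separator.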
Current CVs in DDI 3
External controlled vocabularies
CodeList/CategoryScheme with nested codes for expressing hierarchy
Enumerated lists in XML Schema
Multiple Purposes
Code lists
Classification
Controlled vocabularies
Requirements
General
Ideally only one type of structure for multiple purposes
Use of existing structures is preferred where possible (e.g. SKOS in the Semantic Web)
Same structure for the internal representation (in the model/specification) and the external representation. Reasoning: easier processing and a single software solution; only the reference differs.
Validation of structure, keys, and possibly values
What should be validated: keys, values, relationships, dependencies, requirements, …
XML Schema: what should be validated by the XML parser, and what in a secondary-level validation?
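One possible way to picture the split between the two validation levels is sketched below in Python. It is a hedged illustration only: the first function stands in for the structural checks an XML parser could perform against a schema, the second for a secondary-level check across entries. The required keys, the code pattern, and the function names are assumptions made for this sketch.

```python
import re
from typing import Dict, List

# Assumed code syntax: alphanumeric segments joined by the "." separator.
CODE_PATTERN = re.compile(r"^[A-Za-z0-9]+(\.[A-Za-z0-9]+)*$")
REQUIRED_KEYS = {"value", "term"}  # hypothetical minimal structure


def check_structure(entry: Dict) -> List[str]:
    """First level: checks on a single entry (schema-like validation)."""
    errors = []
    missing = REQUIRED_KEYS - entry.keys()
    if missing:
        errors.append(f"missing keys: {sorted(missing)}")
    if "value" in entry and not CODE_PATTERN.match(entry["value"]):
        errors.append(f"malformed code: {entry['value']!r}")
    return errors


def check_relationships(entries: List[Dict]) -> List[str]:
    """Second level: checks across entries, here that every parent code exists."""
    known = {e["value"] for e in entries if "value" in e}
    errors = []
    for value in sorted(known):
        parent, sep, _ = value.rpartition(".")
        if sep and parent not in known:
            errors.append(f"{value!r} has no parent {parent!r}")
    return errors


# Illustrative data; the codes are made up for this sketch.
entries = [
    {"value": "A", "term": {"en-US": "a"}},
    {"value": "B.C", "term": {"en-US": "c"}},
]
print(check_structure({"value": "bad..code"}))
print(check_relationships(entries))
```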
Simple approach
Hierarchy
Multi-lingual text for term and definition
As simple as possible, easy to process
Complex
Requirement of items
Relationship of items
Use case: thesaurus
CV for keys and CV for values of key/value pairs
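The key/value case and the requirement of items could look roughly like the following Python sketch. The vocabularies, the required key, and the function name are hypothetical and serve only to illustrate the idea of one CV for the keys and one CV per key for the allowed values.

```python
from typing import Dict, List

# Hypothetical vocabularies for this sketch.
KEY_CV = {"Mode", "Language"}                      # CV for the keys
VALUE_CVS = {                                      # one CV of values per key
    "Mode": {"Interview", "Interview.FaceToFace"},
    "Language": {"en", "de"},
}
REQUIRED_KEYS = {"Mode"}                           # requirement of items


def validate_pairs(pairs: Dict[str, str]) -> List[str]:
    """Check key/value pairs against the key CV, the value CVs, and required keys."""
    errors = []
    for key, value in pairs.items():
        if key not in KEY_CV:
            errors.append(f"unknown key: {key}")
        elif value not in VALUE_CVS[key]:
            errors.append(f"value {value!r} not allowed for key {key!r}")
    for key in sorted(REQUIRED_KEYS - pairs.keys()):
        errors.append(f"required key missing: {key}")
    return errors


print(validate_pairs({"Mode": "Interview"}))   # []
print(validate_pairs({"Language": "fr"}))
```

Each value CV here is just a set of codes, so it could be filled from simple CV structures; the additional rules (allowed keys, required keys) sit on top of them.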
Conclusion
Current sense: two different approaches are required. The first is a simple CV structure, similar to the approach of the current DDI Controlled Vocabularies. The second is a more complex approach for additional requirements such as validation and defining requirements.
Ideally the complex approach could make use of the simple approach.