List of initial terms to keep (completed Feb 9, 2022)
updated with additions 2022-02-23
- in process of adding in new terms
put the mission statement at the top of the glossary
put a note in about going to another page for the software used to interpret DDI
therefore no software will be in the Glossary
we have to remember that the terms below are specifically relating to DDI
we will be referring to the original glossary and building on it
note - these descriptions are more descriptive
Archive (n.)
look in OAIS spec (https://en.wikipedia.org/wiki/Open_Archival_Information_System)
OAIS: Archive: An organization that intends to preserve information for access and use by a Designated Community. (p. 1-9) ISO 14721 (2003, 2012)
any organization that maintains data for the long haul, including preservation (this is what it means for Lifecycle)
CASRAI
"A physical place or digital location containing curated static records and data. Set up and managed to established standards, (e.g. ISAD(G) https://www.ica.org/en/isadg-general-international-standard-archival-description-second-edition or Core Trust Seal https://www.coretrustseal.org/) that ensure long term integrity, security, authenticity and accessibility of the records and data"
draws on ICA (International Council on Archives) definition
DDI-L: A maintainable module containing information related to the archiving (long term access and/or preservation) of the data and metadata.
Codebook (for the purposes of DDI)
what is a codebook
codebook standard under DDI
DDI-C is an XML representation of a codebook
includes a lot of extra metadata about the study
questionnaire, how the study was conducted, …
conceptual variable
GSIM
CDI
DDI-L
Controlled vocabulary
how they get used in DDI
to configure the standard used
CASRAI: A list of standardized terminology, words, or phrases, used for indexing or content analysis and information retrieval, usually in a defined information domain.
Cross-Domain Integration
also known as CDI, DDI-CDI
a DDI product
a nascent standard in the DDI family
Correspondence table - see Crosswalk
GSIM
used to map similar things in collections
used in classifications specifically
Crosswalk - see Correspondence table
GSIM
Data Documentation Initiative
The Data Documentation Initiative (DDI) is a suite of open, human-readable, and machine-actionable specifications used internationally for describing the data produced with surveys and other observational methods in the social, behavioral, economic, and health domains
Data lifecycle
The stages of the data production and management process to support research and policy covering conceptualization, design, acquisition, processing, analysis, sharing, and archiving
Datum
A piece of information
[DDI-CDI definition]
[DDI Lifecycle definition]
DDI agency
DDI instance
1. The root element of a related set of DDI metadata in the DDI Lifecycle XML Schema
2. In general use, any XML instance containing DDI metadata
DDI Lifecycle Model
put a link in to the canonical model, that is, the Technical Committee version - have to ask them where it is!
DDI Lite
A simple subset of DDI Codebook elements with basic information describing a dataset. (See https://ddialliance.org/specification/ddi2.1/lite/index.html)
DDI Profile (i.e., Lite, etc.)
1. A selection of metadata fields conforming to the DDI Codebook, DDI Lifecycle, or DDI-CDI specification for use by a particular community or for a specific application
2. In DDI Lifecycle, a formal XML expression of the elements used by a particular community or application
[NOTE: Provide good examples]
DDI scheme
For DDI Lifecycle, a package of related metadata items of a single type (e.g., concepts, variables, categories) for the purposes of data/metadata management and reuse, owned and maintained by the DDI agency
dimensional data
Synonyms: multidimensional data, data cube, N-Cube
Data organized according to multiple axes which act as a coordinate system for identifying and describing individual datums. [?]
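note - a minimal sketch of the coordinate-system idea, not taken from any DDI specification. The dimension names, category labels, and counts are all made up for illustration; each cell of the cube is addressed by one value per axis.

```python
# Hypothetical two-dimensional cube; axes and values are invented for illustration.
dimensions = ("sex", "age_group")

# Each key is a coordinate (one value per dimension); each value is the cell's datum.
cube = {
    ("female", "0-17"): 120,
    ("female", "18+"): 340,
    ("male", "0-17"): 130,
    ("male", "18+"): 310,
}


def lookup(cube, **coords):
    """Return the datum in the cell identified by one value per dimension."""
    key = tuple(coords[name] for name in dimensions)
    return cube[key]


value = lookup(cube, sex="female", age_group="18+")
```

Adding a third axis (for example, a year) would only lengthen the coordinate tuple; the lookup logic stays the same.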
DISCO [this should be in the acronyms list also]
The DDI-RDF Discovery Vocabulary, which is a standard set of metadata generalized from DDI Codebook and DDI Lifecycle for supporting Web searches for data using the W3C linked data technologies. Note that DISCO is still under development.
Discovery
Need a definition that is clear on DDI usage. Candidate: the ability to uncover resources described using DDI metadata? Candidate: identifying programmatically the relevant resources (datasets, studies?) for a specific research purpose (from the DDI-RDF vocabulary web page). Corresponds to the ‘Find’ part of FAIR.
Dissemination
From Lifecycle. Focus on usage in DDI context. See paper, page 5 terminology with possible definition.
DTD
Acronym: Document Type Definition. A document description language that largely predates XML Schema. The first version of Codebook was a DTD. Note that the term has been superseded and is archaic. See DDI taxonomy page.
External reference – link to a resource that is external to the metadata instance (e.g. a vocabulary concept) (2022-01-26). Needs thought and review. Need to clarify whether this is a reference to a DDI concept or to something external to DDI. Should cover both the DDI technical sense of a reference to other DDI metadata and the general sense of a reference to non-DDI things
Genericode
http://docs.oasis-open.org/codelist/cs-genericode-1.0/doc/oasis-code-list-representation-genericode.html “OASIS Code List Representation format, ‘genericode’, is a single model and XML format (with a W3C XML Schema) that can encode a broad range of code list information.” Need to determine whether DDI codelists use this encoding; if not, remove term.
Identifiable
Used in Lifecycle: a class of things that have an identifier. Motivated by the need to reference or reuse some information object.
IHSN toolkit
International Household Survey Network application of DDI Codebook. Software tools for working with the application. Term might need to be updated to the label currently used, perhaps “IHSN Microdata Management Toolkit”.
Inclusion inline vs. by reference - need to look at how this is presented in the specs, but need clarity on external reference and internal publication of DDI schemes; the glossary should have the same term (label) that is used in the specifications. [make a positive statement to the effect of ‘DDI Lifecycle XML uses references between instances and sources of metadata to enable reuse. Publication of DDI schemes supports this functionality’]
instance variable - variable in the context of a particular dataset; define with this approach: ‘conceptual variable is…’, ‘represented variable is conceptual variable with…’, ‘instance variable is a represented variable as used in a dataset…’; denotes inclusion of information about the source of the data (context…). Inherited from GSIM. Appears in DDI Lifecycle and CDI. Most granular element in the variable cascade.
instrument - implementable mechanism for collecting data. Note - typically a questionnaire or sensor
Interoperability - as defined in the FAIR principles (with a link to the GO FAIR principles - https://www.go-fair.org/fair-principles/). Several aspects: data, instruments, semantics, system, syntax. The capacity for systems (things, agents) to interact meaningfully and correctly. [System is construed broadly to include any kind of interacting agent.]
key-value data (datastore, structure) - data in which each value is associated with an identifying field (string). Add note that identifying field (key) is unitary; key has no internal structure [continue discussion from here, stop 2022-05-04]
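note - a minimal sketch of the key-value idea, assuming a plain in-memory store. The function names, keys, and values are hypothetical and not part of any DDI product. The key is treated as a single opaque string: the store matches it exactly and never looks inside it.

```python
# Hypothetical in-memory key-value store; all names and data are invented.
store = {}


def put(key: str, value):
    """Associate a value with a unitary key (the key is never parsed)."""
    store[key] = value


def get(key: str):
    """Return the value stored under exactly this key."""
    return store[key]


put("age", 34)
put("country", "CA")

age = get("age")
```

Because the key has no internal structure, a lookup either matches the whole key or fails; there is no notion of querying by part of a key.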
Lifecycle - see above (that is, Data lifecycle, DDI lifecycle)
Logical record - also look to see if GSIM has a logical record - if it does, need to mention it
long data
Machine-actionable
Maintainable (still used in DDI Lifecycle - like a database table of items that are maintained as a whole)
Major version
Minor version
Metadata - social science, behavioral definition
NADA cataloging tool - NADA is an open source microdata cataloging system, compliant with the Data Documentation Initiative (DDI) and Dublin Core’s RDF metadata standards. https://nada.ihsn.org/
N-Cubes - multi-dimensional data cubes used in DDI Codebook
Nesstar - even though it is software, it is used for DDI
Physical record - physical recording of the values of the logical record
questionnaire
Register - a list
administrative data holding information that can be used for research on its subjects, e.g., tax, births and deaths
Registry - catalogue through which data can be found, e.g., SDMX
ISO/IEC 11179
Repository - place where data and metadata holdings are maintained and distributed, e.g., an archive
represented variable
Resource package - in DDI Lifecycle, a special construction, not for a specific dataset
check to make sure it is still used
SDTL - [this should be in the acronyms list also] an independent intermediate language for representing data transformation commands (from III. Purpose in https://ddi-alliance.atlassian.net/wiki/download/attachments/860815393/Part_1_DDI-CDI_Intro_PR_1.pdf )
this is a DDI product, https://ddialliance.org/products/sdtl/1.0
series – use definition in DDI Lifecycle
statistical classification – add note on clarification of ‘statistical’ vs. other kinds of classification. Classification should be exclusive and exhaustive. See usage in XKOS, CDI, Lifecycle, and GSIM.
study
survey
unit of measurement
unit type
universe
URI - in relation to DDI
URL - in relation to DDI
URN - in relation to DDI
variable cascade
variable types (in context with Lifecycle, Codebook, …)
Versionable
Versioning - a technical specification in DDI
talk about the different specs
wide data
XKOS - this should be in the acronyms list also
this is a DDI product
XML Schema
Possible Terms to add
Acronyms
CDI
DDI
DISCO
DTD
ESS -- European Social Survey - should be in list of acronyms. Not really a concept (2022-01-26)
do we really need this - how is it related to DDI?
IHSN
SDTL
XKOS