Glossary work

List of initial terms to keep (completed Feb 9, 2022)

updated with additions 2022-02-23

- in process of adding in new terms

  • put the mission statement at the top of the glossary

  • add a note pointing to a separate page that lists the software used to interpret DDI

    • therefore no software will be listed in the Glossary

  • remember that the terms below relate specifically to DDI

  • we will be referring to the original glossary and building on it

    • note - the descriptions here are more detailed than those in the original glossary

 

  • Archive (n.)

  • Codebook (for the purposes of DDI)

    • what is a codebook

    • codebook standard under DDI

    • DDI-C is an XML representation of a codebook

      • includes a lot of extra metadata about the study

        • questionnaire, how the study was conducted, …

  • conceptual variable

    • GSIM

    • CDI

    • DDI-L

  • Controlled vocabulary

    • how they get used in DDI

      • to configure the standard used

    • CASRAI: A list of standardized terminology, words, or phrases, used for indexing or content analysis and information retrieval, usually in a defined information domain.

  • Cross-Domain Integration

  • Correspondence table - see Crosswalk

    • GSIM

    • used to map similar things in collections

    • used in classifications specifically

  • Crosswalk - see Correspondence table

    • GSIM

  • Data Documentation Initiative

    • The Data Documentation Initiative (DDI) is a suite of open, human-readable, and machine-actionable specifications used internationally for describing the data produced with surveys and other observational methods in the social, behavioral, economic, and health domains

  • Data lifecycle

    • The stages of the data production and management process to support research and policy covering conceptualization, design, acquisition, processing, analysis, sharing, and archiving

  • Datum

    • A piece of information

    • [DDI-CDI definition]

    • [DDI Lifecycle definition]

  • DDI agency

  • DDI instance

    1. The root element of a related set of DDI metadata in the DDI Lifecycle XML Schema

    2. In general use, any XML instance containing DDI metadata

  • DDI Lifecycle Model

    • add a link to the canonical model, i.e. the Technical Committee version (need to ask them where it is)

  • DDI Lite

  • DDI Profile (i.e., Lite, etc.)

    1. A selection of metadata fields conforming to the DDI Codebook, DDI Lifecycle, or DDI-CDI specification for use by a particular community or for a specific application

    2. In DDI Lifecycle, a formal XML expression of the elements used by a particular community or application

    [NOTE: Provide good examples]

  • DDI scheme

    • For DDI Lifecycle, a package of related metadata items of a single type (e.g., concepts, variables, categories) for the purposes of data/metadata management and reuse, owned and maintained by the DDI agency

  • dimensional data

    • Synonyms: multidimensional data, data cube, N-Cube

    • Data organized according to multiple axes which act as a coordinate system for identifying and describing individual datums. [?]

  • DISCO [this should be in the acronyms list also]

    • The DDI-RDF Discovery Vocabulary, which is a standard set of metadata generalized from DDI Codebook and DDI Lifecycle for supporting Web searches for data using the W3C linked data technologies. Note that DISCO is still under development.

  • Discovery

    • Need a definition that is clear on DDI usage. Candidates: (a) the ability to uncover resources described using DDI metadata; (b) identifying programmatically the relevant resources (datasets, studies?) for a specific research purpose (from the DDI-RDF vocabulary web page). Corresponds to the ‘Findable’ part of FAIR.

  • Dissemination

    • From Lifecycle. Focus on usage in DDI context. See paper, page 5 terminology with possible definition.

  • DTD

    • Acronym for Document Type Definition. A document description language that largely predates XML Schema. The first version of Codebook was a DTD. Note that the term has been superseded and is archaic. See the DDI taxonomy page.

  • External reference - link to a resource that is external to the metadata instance (e.g. a vocabulary concept) (2022-01-26). Needs thought and review: clarify whether this means a reference to a DDI concept or to something external to DDI. Cover both the DDI technical sense (reference to other DDI metadata) and the general sense (reference to non-DDI things).

  • Genericode

  • Identifiable

    • used in Lifecycle: a class of things that have an identifier. Motivated by the need to reference or reuse an information object.

  • IHSN toolkit

  • Inclusion inline vs. by reference - need to look at how this is presented in the specs, and need clarity on external reference and internal publication of DDI schemes; the glossary should use the same term (label) that is used in the specifications. [make a positive statement to the effect of ‘DDI Lifecycle XML uses references between instances and sources of metadata to enable reuse. Publication of DDI schemes supports this functionality’]

  • instance variable - a variable in the context of a particular dataset. Define the three terms together: ‘conceptual variable is…’, ‘represented variable is a conceptual variable with…’, ‘instance variable is a represented variable as used in a dataset…’; the last denotes inclusion of information about the source of the data (context…). Inherited from GSIM. Appears in DDI Lifecycle and CDI. Most granular element in the variable cascade.

  • instrument - implementable mechanism for collecting data. Note: typically a questionnaire or sensor.

  • Interoperability - as defined in the FAIR principles (with a link to the GO FAIR principles - https://www.go-fair.org/fair-principles/ ). Several aspects: data, instruments, semantics, system, syntax. The capacity for systems (things, agents) to interact meaningfully and correctly. [System is construed broadly to include any kind of interacting agent.]

  • key-value data (datastore, structure) - data in which each value is associated with an identifying field (a string). Add a note that the identifying field (key) is considered unitary; the key does not imply any internal structure.

  • Lifecycle - see above (that is, Data lifecycle, DDI lifecycle, Survey lifecycle per GSBPM)

  • Logical record - the schema for the content of an information item (record), in contrast to a physical record. It tells what items are in the record and how they are related; the physical record defines the format and specific representation. NOTE: also check whether GSIM has a logical record; if it does, mention it. Investigate whether the term is used inconsistently across DDI standards, and point out any inconsistencies.

  • Long data - like an event history; each row holds one value of one variable, with a column identifying the unit. Allows adding new variables without adding columns. Similar to RDF triples, or to 5th or 6th normal form in a DBMS. Need to be able to reference a registry of variables.

  • Machine-actionable - see https://ddialliance.org/taxonomy/term/198 “information that is structured in a consistent way so that machines, or computers, can be programmed against the structure.” [finish here 2022-05-18]

  • Maintainable (still used in DDI Lifecycle - like a database table of items that are maintained as a whole)

  • Major version

  • Minor version

  • Metadata - social science, behavioral definition

  • NADA cataloging tool - NADA is an open source microdata cataloging system, compliant with the Data Documentation Initiative (DDI) and Dublin Core’s RDF metadata standards. https://nada.ihsn.org/

  • N-Cubes - multi-dimensional data cubes used in DDI Codebook

  • Nesstar - even though it is software, it is used for DDI

  • Physical record - physical recording of the values of the logical record

  • questionnaire

  • Register - a list

    • administrative data holding information that can be used for research on the subjects, e.g. tax, births and deaths

  • Registry - catalogue that can be used to find data, e.g. SDMX

    • ISO/IEC 11179

  • Repository - place where data and metadata holdings are maintained and distributed, e.g. an archive

  • represented variable

  • Resource package - in DDI Lifecycle, a special construction, not for a specific dataset

    • check to make sure it is still used

  • SDTL - [this should be in the acronyms list also] an independent intermediate language for representing data transformation commands (from III. Purpose in https://ddi-alliance.atlassian.net/wiki/download/attachments/860815393/Part_1_DDI-CDI_Intro_PR_1.pdf )

  • series – use definition in DDI Lifecycle

  • statistical classification – add note on clarification of ‘statistical’ vs. other kinds of classification. Classification should be exclusive and exhaustive. See usage in XKOS, CDI, Lifecycle, and GSIM.

  • study

  • survey

  • unit of measurement

  • unit type

  • universe

  • URI - in relation to DDI

  • URL - in relation to DDI

  • URN - in relation to DDI

  • variable cascade

  • variable types (in context with Lifecycle, Codebook, …)

  • Versionable

  • Versioning - a technical specification in DDI

    • talk about the different specs

  • wide data

  • XKOS [this should be in the acronyms list also]

    • this is a DDI product

  • XML Schema
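
The "Long data" and "wide data" entries above could be illustrated for glossary readers with a small reshaping sketch. This is only a minimal illustration: the unit identifiers, variable names, and values are made up, and the function names `wide_to_long` and `long_to_wide` are hypothetical helpers, not part of any DDI specification or tool.

```python
# Hypothetical wide data: one row per unit, one column per variable.
wide_rows = [
    {"unit": "p1", "age": 34, "income": 52000},
    {"unit": "p2", "age": 41, "income": 61000},
]


def wide_to_long(rows, unit_key="unit"):
    """Turn one-row-per-unit records into one-row-per-value records."""
    long_rows = []
    for row in rows:
        for variable, value in row.items():
            if variable == unit_key:
                continue  # the unit identifier is carried on every long row
            long_rows.append(
                {"unit": row[unit_key], "variable": variable, "value": value}
            )
    return long_rows


def long_to_wide(rows):
    """Rebuild one-row-per-unit records from unit/variable/value rows."""
    by_unit = {}
    for row in rows:
        record = by_unit.setdefault(row["unit"], {"unit": row["unit"]})
        record[row["variable"]] = row["value"]
    return list(by_unit.values())


long_rows = wide_to_long(wide_rows)

# Adding a new variable in long form adds a row; the columns stay
# unit / variable / value.
long_rows.append({"unit": "p1", "variable": "region", "value": "north"})
```

In the long form the `variable` column holds names that would, per the note in the entry, need to resolve against a registry of variables; the sketch leaves that lookup out.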

 
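The layered definitions proposed under "instance variable" (conceptual, then represented, then instance) could be sketched as nested records. This is a rough sketch under stated assumptions, not the GSIM or DDI model: the class and field names are hypothetical, and the `representation` field in particular is an assumption about what the unfinished "represented variable is a conceptual variable with…" definition would add.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualVariable:
    concept: str    # what is measured, e.g. "age" (illustrative)
    unit_type: str  # kind of unit it describes, e.g. "person" (illustrative)


@dataclass(frozen=True)
class RepresentedVariable:
    conceptual: ConceptualVariable
    representation: str  # ASSUMPTION: how values are expressed


@dataclass(frozen=True)
class InstanceVariable:
    represented: RepresentedVariable
    dataset: str  # the particular dataset in which the variable is used
    source: str   # context: information about the source of the data


def cascade(instance):
    """Return the cascade from most general to most granular."""
    return [instance.represented.conceptual, instance.represented, instance]


# Made-up example values.
age = ConceptualVariable(concept="age", unit_type="person")
age_years = RepresentedVariable(conceptual=age, representation="integer years")
age_in_survey = InstanceVariable(
    represented=age_years, dataset="survey_2022.csv", source="self-reported"
)
```

Each level wraps the one above it, so the instance variable is the most granular element and the conceptual variable can be reused across datasets.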

Possible Terms to add

Acronyms

  • CDI

  • DDI

  • DISCO

  • DTD

  • ESS - European Social Survey - should be in the list of acronyms; not really a concept (2022-01-26)

    • do we really need this - how is it related to DDI?

  • IHSN

  • SDTL

  • XKOS