DWD Meeting Minutes

 2017 03 13

DDI Workflows for Dataverse WG - Initial Meeting - March 13th 2017

Present: Johan Fihn, Olof Olsson (SND), Esra Akdeniz (GESIS), Marion Wittenberg (DANS), Larry Hoyle (University of Kansas), Sebastian Karcher (Qualitative Data Repository, SYR), Danny Brooke, Julian Gautier, Gustavo Durand (IQSS, Harvard University), Kevin Worthington, Amber Leahey (Scholars Portal, Ontario Council of University Libraries)

Overview of group: The group is gathering to identify opportunities and use cases for enhancing support for DDI in the Dataverse system. This includes discussion of organizational use cases and of metadata workflows from both researcher-generated and library / curation manager / archives mediated perspectives. The group will also discuss priorities for DDI enhancements in Dataverse that will lead to greater adoption, support, and researcher usefulness for both DDI and Dataverse.

Meeting Notes:

  • Group background  (EVERYONE)

    • Interest from a variety of organizations including national data archives, libraries, Dataverse development team, repositories, consortiums:

    • Scholars Portal, SND, DANS, GESIS, IQSS Harvard, CESSDA

    • Interest in us coming together at NADDI, IASSIST, and Dataverse Community Meeting to consolidate and discuss use cases and priorities

    • Many of us have existing DDI repositories/tools and are looking for a way to represent / migrate these metadata holdings for discovery and access in Dataverse

    • Marion is working on a Dataverse proposal for CESSDA

    • Esra is working on a metadata research project and taking a closer look at Dataverse

    • Scholars Portal is developing a Dataverse app for exploration of variables using DDI metadata

    • DANS and SND are developing national infrastructure services that include Dataverse as a repository

    • Some of us are using DDI Codebook or DDI-like metadata in separate repositories (e.g. Nesstar)

    • SND is using DDI Lifecycle in a local repository

    • Some discussion about the potential for reusable metadata (DDI Lifecycle) in Dataverse, including the separation between concepts and variables and between questions and variables, so that metadata items can be reused during the study design stages.

    • Marion mentioned that CESSDA is developing a Question Bank and so might not need the ability to reuse metadata in Dataverse


  • Use Cases (EVERYONE)


  • DDI Import (metadata transfer)

    • Need improved tools for uploading DDI XML in the Dataverse UI (for researchers, support staff);

    • No comparison list of supported DDI elements between repositories currently exists; one could help identify gaps in support;

    • Need improved tools for migrating DDI from other systems into Dataverse (e.g. through command line import, API, migration scripts)

      • Support DDI-Codebook (Nesstar flavour)

      • DDI Lifecycle (DDI Lifecycle 3.2; local profiles)
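A comparison of supported DDI elements, as mentioned above, could start from something like the following minimal Python sketch. It lists the element paths used in a DDI Codebook file and reports those missing from a list of paths a target repository is assumed to import. The sample XML, the `SUPPORTED_PATHS` set, and the function names are all hypothetical illustrations; they do not describe what Dataverse or any other repository actually supports.

```python
import xml.etree.ElementTree as ET

# Hypothetical DDI Codebook 2.5 fragment; real exports are far larger.
SAMPLE_DDI = """<codeBook xmlns="ddi:codebook:2_5">
  <stdyDscr>
    <citation>
      <titlStmt><titl>Example Survey</titl></titlStmt>
    </citation>
  </stdyDscr>
  <dataDscr>
    <var name="sex">
      <labl>Sex of respondent</labl>
      <qstn><qstnLit>What is your sex?</qstnLit></qstn>
    </var>
  </dataDscr>
</codeBook>"""

# Assumption: element paths the target repository can import (illustrative only).
SUPPORTED_PATHS = {
    "codeBook",
    "codeBook/stdyDscr",
    "codeBook/stdyDscr/citation",
    "codeBook/stdyDscr/citation/titlStmt",
    "codeBook/stdyDscr/citation/titlStmt/titl",
    "codeBook/dataDscr",
    "codeBook/dataDscr/var",
    "codeBook/dataDscr/var/labl",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def element_paths(xml_text):
    """Return the set of slash-separated element paths used in the document."""
    paths = set()

    def walk(elem, prefix):
        name = local_name(elem.tag)
        path = f"{prefix}/{name}" if prefix else name
        paths.add(path)
        for child in elem:
            walk(child, path)

    walk(ET.fromstring(xml_text), "")
    return paths


def unsupported_paths(xml_text, supported):
    """Paths present in the document but absent from the supported list."""
    return sorted(element_paths(xml_text) - supported)


if __name__ == "__main__":
    for path in unsupported_paths(SAMPLE_DDI, SUPPORTED_PATHS):
        print(path)
```

Running the same check over each repository's exports would give a first, rough gap list; attributes and DDI Lifecycle documents would need separate handling.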


    • Metadata reusability

      • Could Dataverse support reusable metadata that researchers can search and reuse in the design of data (e.g. for reuse in questionnaires, variable coding, value coding, etc.)?

        • Some discussion from the group about using DDI Lifecycle metadata to improve the distinction between concepts and variables and to enable discovery of concepts across Dataverses (e.g. Gender (or Sex) concepts, and the variables that represent those concepts, could become reusable or discoverable)

        • Reusability factor of concepts, questions, variables, value lists, etc.

        • Could enable attribution of reusable metadata (e.g. standardized codelists (such as NAICS classifications), attribution for concepts and implementation of abstract concepts)

  • Harmonize by-design

    • Similar to metadata reusability; makes metadata items reusable for researchers during the conceptual and study design phase of the lifecycle.


  • Variable-level DDI

    • Improved description of variables in general:

      • (e.g. frequencies, summary statistics, missing values, questions, concepts, universe notes, units of measurement, routing / skip logic (questionnaires), mappings between questions, concepts AND variables, provenance at the variable-level)

    • DDI Lifecycle supports the distinction between questions and variables, and among questions, variables, and concepts.

    • System could benefit from more access rights metadata being captured at the variable level; could this be incorporated into the DataTags workflow? It would help with understanding restricted data access, master copies, access copies, etc.

    • Attribution at the variable level (e.g. stop the naming of variables “ _____ scale of …”)

    • DDI XML to accompany TAB file downloads, since the TAB file alone isn’t enough; maybe DDI could be written to a setup file structure for reuse?
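To illustrate the setup file idea above, here is a minimal Python sketch that reads variable and category labels from a DDI Codebook fragment and writes SPSS-style `VARIABLE LABELS` and `VALUE LABELS` commands. The sample XML and the function names are hypothetical, and the sketch assumes numeric category codes; it is not how Dataverse produces its downloads.

```python
import xml.etree.ElementTree as ET

NS = {"ddi": "ddi:codebook:2_5"}

# Hypothetical variable-level DDI Codebook fragment.
SAMPLE_DDI = """<codeBook xmlns="ddi:codebook:2_5">
  <dataDscr>
    <var name="sex">
      <labl>Sex of respondent</labl>
      <catgry><catValu>1</catValu><labl>Male</labl></catgry>
      <catgry><catValu>2</catValu><labl>Female</labl></catgry>
    </var>
    <var name="age">
      <labl>Age in years</labl>
    </var>
  </dataDscr>
</codeBook>"""


def quote(text):
    """Wrap text in double quotes, doubling any embedded double quote."""
    return '"' + text.strip().replace('"', '""') + '"'


def ddi_to_setup(xml_text):
    """Build SPSS-style label commands from DDI Codebook variable metadata."""
    root = ET.fromstring(xml_text)
    lines = []
    for var in root.findall("ddi:dataDscr/ddi:var", NS):
        name = var.get("name")
        label = var.findtext("ddi:labl", default="", namespaces=NS)
        if label.strip():
            lines.append(f"VARIABLE LABELS {name} {quote(label)}.")
        pairs = []
        for cat in var.findall("ddi:catgry", NS):
            value = cat.findtext("ddi:catValu", default="", namespaces=NS).strip()
            cat_label = cat.findtext("ddi:labl", default="", namespaces=NS)
            if value:
                # Assumption: codes are numeric, so they are written unquoted.
                pairs.append(f"{value} {quote(cat_label)}")
        if pairs:
            lines.append(f"VALUE LABELS {name} " + " ".join(pairs) + ".")
    return "\n".join(lines)


if __name__ == "__main__":
    print(ddi_to_setup(SAMPLE_DDI))
```

A fuller setup file would also need the command that reads the TAB file itself, plus missing-value declarations, which this sketch leaves out.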


  • Support for DDI controlled vocabularies

    • In general more controlled vocabularies could help with metadata generation

      • DDI controlled vocabularies would help with filling in DDI

      • Could be enforced

      • Support for local controlled vocabularies

      • Need for multilingual controlled vocabulary support (also need for a multilingual interface; Scholars Portal is working on a French interface)
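The enforcement and multilingual points above can be sketched in a few lines of Python. The vocabulary below is a hypothetical two-term example with illustrative English and French labels, and the function names are assumptions; a real deployment would load a published DDI controlled vocabulary or a local one.

```python
# Hypothetical controlled vocabulary: code -> labels by language.
VOCAB = {
    "Longitudinal": {"en": "Longitudinal", "fr": "Longitudinale"},
    "CrossSection": {"en": "Cross-section", "fr": "Transversale"},
}


def validate_term(code, vocab=VOCAB):
    """Enforce the vocabulary: reject any code that is not a known term."""
    if code not in vocab:
        raise ValueError(f"{code!r} is not in the controlled vocabulary")
    return code


def label_for(code, lang, vocab=VOCAB, fallback="en"):
    """Return the label in the requested language, else the fallback language."""
    labels = vocab[validate_term(code, vocab)]
    return labels.get(lang, labels[fallback])


if __name__ == "__main__":
    print(label_for("CrossSection", "fr"))
```

Storing the code rather than the label keeps the metadata language-neutral, so the same record can be shown through an English or a French interface.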


  • Support for Local DDI Profile

    • Customize metadata fields and potentially prepare templates for local DDI Profile upon configuration of an instance

    • Or support DDI profile as a template at the Dataverse level (bring back templates)

    • (E.g. CESSDA, ODESI, and others have DDI Profiles or DDI Best Practice Documents)


  • Improved variable and metadata visualization

    • Scholars Portal is developing a DataExplorer app that uses DDI metadata to explore variables in the context of the dataset

    • Identified need to improve discovery of variables in the system, with more DDI metadata (for tabular files)