TC Meeting Minutes 2022-2023
Wendy Thomas
Earlier Meeting Minutes:
2020-2021 Minutes Page, 2018-2019 Minutes Page, 2016-2017 Minutes Page, Pre-2016 Minutes Page
ATTENDEES: Wendy, Jon, Dan, Olof, Flavio, Jeremy
Excused: Oliver
DDI Lifecycle version 4.0 BETA
Jon and Dan are checking it out in EA by entering and then exporting a single instrument
Announcement content:
Add UML output
Change to 4.0
Info on free account on github
LinkML is included in the full download package but these are the ones that are the official ones
There are 2 different OWL publishers (closed generated by owl publisher) others generated by LinkML toolkit
Maybe for high level documentation we can pull background information together as well as address questions raised
Implementation of serializations was focus of former high level - need to determine what is model level
Final steps
Dan will merge UML updates in shortly and tag a package for release
Dan will inform Wendy to send out and copy to pages in Confluence
Plan in some web presentations/Q&A sessions - Jon will come up with a schedule of promotional/informational events (first meeting in Jan)
Make comment period through end of March
Change to Beta version in documentation
Check out ability to put preview status
Name of the release will include Beta (1,2,3)
Jon will get it out on social media
Updates of ddialliance site pages
Get Wendy the appropriate links for updating pages
Update problems related to these links from Google - should go to current page
DDI-Lifecycle | Data Documentation Initiative
Olof Olsson (9:23 AM): DDI-Codebook | Data Documentation Initiative still has "DDI Codebook 2.5 (under review)"
Next meeting: 4 Jan 2024
ATTENDEES: Wendy, Dan, Olof, Oliver, Flavio
CDI till mid January
Olof moved rest of materials to GitHub
Beta review of DDI-L version 4 out next week
Merged Olof's work on LinkML and will test that out
Copies of documents on Beta review filed on Confluence site so they are easily available.
Get this out next week for Beta review
Send to: DDI-User List, Developers groups, DDI-SRG
Oliver will catch up with Darren again when he is back from vacation (no issues, just follow-up)
ATTENDEES: Wendy, Jon, Dan, Jeremy, Olof, Oliver, Flavio, Christophe
Review pull request for issues #12 and #19
Language tagged strings and documentation, referencing subtypes of versionable by DanSmith · Pull Request #20 · ddialliance/ddimodel
Walked through pull request process; validation, build outputs, see what exactly changed
When pull request is added we can set required reviewers prior to merge
Merge pull request #20 from DanSmith/master · ddialliance/ddimodel@486b43e
Draft of DDI 4 beta release announcement
Timing: Already working in Olof's repository (no restrictions) so need to find out when this can move. Two versions of OWL, straightforward, and with restrictions on domains
restricted model says hasConcept can only be used on the items variable and category (the only place the predicate can be used)
Get feedback on both approaches in OWL (simple OWL - Olof; add in and put in build output)
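To make the difference between the two OWL approaches concrete for reviewers, here is a minimal sketch in plain Python. It treats the domain restriction as a validation rule, which is a simplification: OWL domain axioms are normally used for inference rather than rejection. The class labels, the sample identifiers, and the ALLOWED_DOMAINS table are hypothetical and are not taken from the published OWL files.

```python
# Hypothetical names throughout: class and predicate labels are illustrative
# and are not taken from the published DDI OWL files.
ALLOWED_DOMAINS = {
    "hasConcept": {"Variable", "Category"},
}


def domain_violations(triples, subject_types, allowed=ALLOWED_DOMAINS):
    """Return triples whose predicate is used on a subject outside its allowed classes."""
    violations = []
    for subject, predicate, obj in triples:
        domains = allowed.get(predicate)
        if domains is None:
            continue  # unrestricted predicate, as in the simple OWL variant
        if subject_types.get(subject) not in domains:
            violations.append((subject, predicate, obj))
    return violations


subject_types = {"v1": "Variable", "c1": "Category", "q1": "Question"}
triples = [
    ("v1", "hasConcept", "concept:age"),
    ("c1", "hasConcept", "concept:adult"),
    ("q1", "hasConcept", "concept:age"),
]
print(domain_violations(triples, subject_types))
```

With an empty restriction table (the simple variant) nothing is flagged; with the restricted table only the use on the hypothetical Question item is reported.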
If we get done before the holidays we can get a release out.
Content: create a draft release on Github and create notes
Email draft: email content and draft of any additional information not in content documentation
Complete draft by next week and add to meeting agenda.
Plan timing of DDI 4 beta release (next week?)
Complete draft by next week and add to meeting agenda.
Want full history on Bitbucket and clone to Git
One repository per product
CDI directly to GitHub and remove Bitbucket
Production is not already pipeline on CV so that can be moved
Oliver will double check with Darren if there are any transfer issues
Does anyone have any work that is not pushed?
Wendy will push to dditc
If there is time I'd also like to review the infrastructure plan, including process for implementing and timeline issues
What does this mean for the timing of support of documentation serving
Neat to have this when the beta release went out
Serving line documentation and high level documentation
How is lifecycle updated (in terms of builds) - what are the options in terms of published documentation and documentation of new builds (need to provide access to both)
https://doc.ddialliance/<product>/<version> where we always have: https://doc.ddialliance/<product>/master to reflect the current state of master?
Jon will speak to Jared and make sure we have domain names for documentation
docs.ddialliance.org to point to the documentation server (DNS change handled by ICPSR/UMICH)
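The proposed product/version layout can be sketched as a small helper. This is only an illustration of the pattern discussed above: the host is the docs.ddialliance.org name mentioned in the notes, and the product slug "lifecycle" is a hypothetical example, not an agreed path.

```python
from urllib.parse import quote

# Assumption: host taken from the docs.ddialliance.org note above; the
# <product>/<version> path layout is the one proposed in the minutes.
BASE = "https://docs.ddialliance.org"


def doc_url(product, version="master"):
    """Build a documentation URL; 'master' tracks the current state of master."""
    return f"{BASE}/{quote(product, safe='')}/{quote(version, safe='')}"


# "lifecycle" is a hypothetical product slug used for illustration only.
print(doc_url("lifecycle", "4.0"))
print(doc_url("lifecycle"))
```

Defaulting the version to master gives one stable link per product for the latest build, while published versions keep their own fixed addresses.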
Newsletter article
Others will review. Jon will add info from today ad
ATTENDEES: Wendy, Jon, Dan, Jeremy, Oliver, Flavio
Roadmap document
-- get this done WENDY
EDDI TC meeting agenda
-- overall
Lifecycle get to the point where we can get something out to look at
--Schema changes, COGS stuff, and specific issues about objects
Codebook handful of issues
CDI - what needs to get done
Controlled Vocabulary - what's left
Infrastructure needs piece
TC Organization paper
Two basic models
-- Bit of a problem with the product working groups reporting directly to the Scientific Board due to the workflow of product releases through the TC
-- The important thing is that the TC has a responsibility for the infrastructure
-- Autonomy of the working groups retained but clearer relationships [rationalize relationships]
RDF common set working group
Those who expressed interest have received draft of announcement. Please review so Dan can send out
IASSIST
No DDI meetings at IASSIST this year
No TC members have submitted presentations - please notify us if you do
ATTENDEES: Wendy, Jon, Dan, Jeremy, Flavio, Darren, Christophe, Oliver
Review of CDI processing
Have a list of people to contact - recheck with Arofan for any additions
Time period is decided
Need text for email
Get issue filing information updated on site
Updates from Flavio this week
Generate packages and get to TC
So still possible within November
May be able to get out by EDDI with a 6 week review period
OK on the approach for issue filing (Jon still needs to verify that general Atlassian login filing works - don't anticipate a problem)
PROPOSAL
On product page:
replace green box with section:
File an issue on this product
If you have an Atlassian account you can CLICK HERE to file an issue.
If you do not have an Atlassian account you can create a new account and then file an issue or use the DDI submission form HERE [link to the File an issue page]
File an issue page:
If you have an account with Atlassian click on the product below to file an issue.
If you do not have an account with Atlassian you can {create an account with Atlassian} [link to Atlassian site] then return to this page and file your issue.
Don't want to create an account? Use this Google Form to provide information on your issue and we will file it for you. You will be provided with the Jira issue number once it is filed so you can view comments and decisions on your issue as work progresses.
If you are not sure which product your issue relates to, file it on the Technical Committee issue tracker. We'll make sure it gets to the right project or projects.
DDI Codebook
DDI Lifecycle
DDI Cross Domain Integration (CDI)
Controlled Vocabularies
XKOS
SDTL
Technical Committee
New working group on RDF (Dan)
Working on announcement on new working group
Look for that in the next day or two
Codebook
3 Codebook items - send Darren a reminder of numbers
Review process for Codebook entries - do remotely maybe hopefully PLEASE
CV releasing a new version resolving SKOS issues
Plan to get our system going on the 23 Nov
Oliver would be able to rerun at any time
Coordinating with them about any static information that needs to move over from DDI site
COSMOS - Admin Data workshop - has it been set?
Been difficult to nail down the meeting coordination
Flavio is in group that is coordinating meeting
Mon/Tues before is a time that CDI group would like to have meeting
Could be Mon/Tues after COSMOS
NEXT MEETING:
Roadmap document
Draft agenda for EDDI meeting/TC
Paper about ways in which working group could operate
No meeting in November after 16th; no meeting on December 28th
ATTENDEES: Wendy, Dan, Jeremy, Oliver, Christophe
Alternate Issue Submission form
Use as a backup option for subject matter persons. This should be prominent on the pages. The Zendesk at Colectica is also a good solution (internal user support). It is possible to bring this more to the foreground. Review the ability of non-DDI Jira members to file ASAP.
[Added note 2023-11-08: Changed permissions to allow any Atlassian account holder to create issues. Identified page for information on creating an Atlassian account. Create an Atlassian account | Atlassian Support]
Codebook.
DDICODE-535
Item Adrian-4 #48
Would be difficult to define which format went with which file. If we want to provide multiple file descriptions, everyone would have to add in the logic to figure out the relationship between the file format and the specific file description. The PDF codebook might contain more than one format but would always reference which of the files was being described (link between varFormat and physical file). These issues were why we created Lifecycle. Need to be clear about the purpose of Codebook (1 datafile/dataset in a single format). varFormat provides additional information on the formatting of the variable within a specific file.
How to describe multiple storage formats of a single study description/variable description. Add to Guide/Best Practices content.
DDICODE-534
see issue
DDICODE-537
see issue
Funding request to support IT infrastructure for development tools - maintenance, upgrade etc.
Does the RDF server fall under that? yes
Better description on what types of things would be covered, how tapping in or approval of funds would work. Opportunities for paying for some CI stuff for running serializations. For instance CI builds are running on Dan's personal account, but this should be moved to DDI github account.
Add to agenda for discussion at EDDI meeting.
ATTENDEES: Wendy, Flavio, George
discussed DDICODE-536 and DDICODE-537 see comments in issues
Items for a process derivative products
canonical UML model
go to Library domain model UML class diagram example describes main library classes and relationships.
Derivative of the process we use "canonical UML"
Set up the review and publication process for these process derivative products
Cross DDI product toolkit for working with primary products:
DDI profiles
canonical UML
UCMIS
Transformation tools between products
ATTENDEES: Wendy, Dan, Olof, Jeremy, Flavio
Excused: Oliver, Christophe, Jon
Published Agenda
Production pipeline (Oliver's pptx has been added to TC drafts folder: PipelinesInDDIProduction.pptx)
I sent out the CDI Syntax representation document separately - we need to look at this and prepare for the syntax review (I believe there may have been some done at Dagstuhl and we need to take that into account)
Tool discussed at last week's meeting (dependent upon Olof's availability)
CDI Syntax Representations
The document provides the content needed for a technical review of the bindings
Pretty clear on each of the steps
UCMIS purpose and usage explained
Updated package with small changes (cardinality, directionality, etc.) with updated change log
Documentation update and model update - we should get this sometime in early November
2 months may be a reasonable review period, to get comments back by mid-January
Be ready when we receive it and can extend 2 weeks if needed
Question regarding use of other RDF vocabularies
If we want to incorporate DC vocabulary etc. that needs to be moved back into the model
For example we moved some data types back into the UML model
The different syntaxes can have rules for how to pull these in (external namespace or DDI namespace)
It's easier to annotate at the model level
This question is the purpose of the new working group
At least needs to be denoted in the model and manage the representation usages and relationships
There were discussions regarding the acceptance or problems with use of external vocabularies
The idea of a platform specific model comes back for individual syntaxes like OWL or JSON-LD (only for those that needed them)
Has been under debate for some time and RDF group will discuss this. In the meantime this is how it is being done.
Codebook update
Requirements for publication
Process and tooling
Oliver's production process slide
Schema changes are being entered followed by documentation updates
Adrian Dusa’s comments have been very helpful in identifying specific issues
Tool developed at Dagstuhl
GitHub - ddialliance/ddi-cdi-sample-generator: JavaScript example application to generate ddi-cdi (json-ld)
JSON-LD sample generator from CSV file using a view application
Provides an interactive production of CDI
Thought is that this should be expanded to Codebook and Lifecycle
Work should be done in the Developers Group
Aim for this is to show implementors how to implement
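As a rough idea of what a CSV-to-JSON-LD sample generator does, the following sketch turns a CSV header into a small JSON-LD document. It is not the ddi-cdi-sample-generator code (that tool is a JavaScript application), and the example.org vocabulary and the "Variable" and "name" terms are placeholders rather than the DDI-CDI JSON-LD context.

```python
import csv
import io
import json

# Assumption: the example.org vocabulary and the "Variable"/"name" terms are
# placeholders, not the DDI-CDI JSON-LD context used by the real tool.
CONTEXT = {"@vocab": "http://example.org/vocab#"}


def csv_header_to_jsonld(csv_text, dataset_id):
    """Describe each CSV column as a variable node in a small JSON-LD document."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader)
    graph = [
        {"@id": f"{dataset_id}#{name}", "@type": "Variable", "name": name}
        for name in header
    ]
    return {"@context": CONTEXT, "@graph": graph}


sample = "id,age,income\n1,34,52000\n2,41,61000\n"
doc = csv_header_to_jsonld(sample, "http://example.org/dataset/1")
print(json.dumps(doc, indent=2))
```

A real generator would add variable descriptions, data types and the dataset structure; the point here is only the shape of the step from tabular header to linked-data nodes.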
Olof has written a proposal for a new tool for multiple products
The NextDoor Publisher (Codebook, Lifecycle, CDI - export only)
Hope to have an early version out in a few months
What is needed for the sample generator and variable descriptions
Ask to file documentation issues in product or TC JIRA
Example implementation of CV in products
How to use Codebook - will have a best practices so this can be used to capture this content
ADD TO PAGES: Point at tool from each products. Would be a good tool when deciding
This is great. It is something we have been missing for years
For import Colectica basically does what NESSTAR supported plus whatever has been asked for
There is a mapping on the Colectica page - it is a listed task for Colectica to get a complete mapping and this would be useful in NextDoor development
There is an open issue tracker of things Olof and Oliver think need to be done. Add if needed with details.
Issues · ddialliance/ddi-cdi-sample-generator
NCCSV
FUTURE AGENDA ITEMS
Including data in DDI - interesting question regarding a disconnect between data and metadata
Dan has submission of paper for COSMOS (not DDI specific)
Dataset in Lifecycle is not really used because it requires understanding of DDI to unlock the data
Using front matter on CSV with commented out YAML etc.
A good topic to discuss over drinks at EDDI
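For the front matter idea, a minimal sketch of how a reader could separate commented-out metadata lines from the data in a single CSV file. It assumes a "#" comment marker and flat "key: value" lines only; it is not a YAML parser, and the sample keys are hypothetical.

```python
import csv
import io


def split_front_matter(text, marker="#"):
    """Separate leading comment lines (metadata) from the CSV body."""
    lines = text.splitlines()
    meta = {}
    for i, line in enumerate(lines):
        if not line.startswith(marker):
            start = i  # first non-comment line is the CSV header
            break
        key, sep, value = line[len(marker):].partition(":")
        if sep:
            meta[key.strip()] = value.strip()
    else:
        start = len(lines)  # no data lines at all
    rows = list(csv.DictReader(io.StringIO("\n".join(lines[start:]))))
    return meta, rows


# Hypothetical sample: two metadata lines followed by ordinary CSV content.
text = "# title: Example survey\n# age: Age in years\nid,age\n1,34\n2,41\n"
meta, rows = split_front_matter(text)
print(meta)
print(rows)
```

Tools that do not know the convention would need to be told to skip the comment lines, which is part of the data and metadata disconnect raised above.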
Double check with Jon regarding TC presentation at EDDI - put on December agenda
Next week no Dan or Jeremy
Codebook updates next week
ATTENDEES: Wendy, Jon, Darren, Oliver, Christophe, Flavio, Jeremy, Dan
Detailed agenda and outcomes of TC meeting in Ljubljana
Discuss options paper around work group organization - Focused discussion.
Where we are with Lifecycle - get it back on track
We still have to make the decision about embedding XHTML in the schema
Update the converted lifecycle model to support extensions from primitive types (value on class) - either not extend and make properties or change the tooling
FUTURE agenda item: change tooling or adjust model - this is a one time import
FUTURE agenda item: Dublin Core and XHTML embedding in the specification - needs to be in consideration of multiple serializations (RDF, JSON, etc.)
Review of RDF specification - Pierre Antoine was going to add tickets (before or at meeting)
Property (source variable can only be used on a certain class)
Goal is to get something that is ready to move in a level review for those outside of TC
RDF group is getting started in the next few months so not much for the TC meeting
Participants: Wendy, Jon, Dan, Jeremy, Darren, Olof, Christophe(?)
Update from Scientific Board Meeting
sub-groups vs working groups for product development (when they should be sub-group or working group)
Make this a future agenda item regarding this - between now and the TC meeting we get options and implications written down and then presented to SB
Update on CV stuff
There has been some improvement by developers in Slovenia
They changed something on the identification of concepts that now gives persistent non-language identifiers, which means we no longer need to tweak the RDF and can just port it over. Leaving the transformation to codelist and html page
Can probably wrap up in November (still a few bugs)
Quite a lot of issues flagged in CV group itself - Darren should look at this to see if there is anything that we need to worry about
From technical point of view, we should be able to put this up in 4-5 weeks
Production of CDI
Talked to Achim about getting information about pipeline
Diagrams of the workflow of different products (Lifecycle, Moving Forward, SDTL, CDI)
CDI is manually moved through via Achim
Can make possible upgrades by adding multiple access to EA and all should be able to extract the XMI and create a pipeline of the steps following that.
Canonical XMI should be easy to pipeline. Acceleo requires a java library which would need editing at any changes. Sphinx could be pipelined
Could pipeline some of this - main thing is to get this in a check-in check-out github approach
Very few people usually editing in EA and they have to coordinate
EA is not manageable through GIT
Diffing the XMI is technically possible but it's a pain
Has to be forbidden to check in different models of EA and XMI. Never a commit of just one of these files. They HAVE TO BE IN SYNC
Diagrams are missing from this model - go into documentation
Oliver can make these available for a discussion - including future pipeline and getting Achim to correct where needed
Acceleo - main idea is that it is an eclipse plugin that can use a transformation description to create RDF, XSD, etc.
To make this into a pipeline you need to surround transformation scripts with something that is possible to run headless
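The rule above that the EA model and the XMI must never be committed separately could be enforced by a small check in a check-in pipeline. The sketch below is one possible shape, assuming the list of changed files is supplied by the caller (for example from `git diff --cached --name-only` in a pre-commit hook); the file names are hypothetical.

```python
def out_of_sync(changed_files, model_pairs):
    """Return the pairs where exactly one of the two files is in the commit."""
    changed = set(changed_files)
    return [
        (ea, xmi)
        for ea, xmi in model_pairs
        if (ea in changed) != (xmi in changed)
    ]


# Hypothetical file names for the EA project file and its exported XMI.
PAIRS = [("model/ddi-cdi.qea", "model/ddi-cdi.xmi")]

print(out_of_sync(["model/ddi-cdi.xmi", "README.md"], PAIRS))
print(out_of_sync(["model/ddi-cdi.qea", "model/ddi-cdi.xmi"], PAIRS))
```

This only checks that both files change together; it cannot confirm that the XMI was actually exported from the committed EA model.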
Tool from Dagstuhl
Future agenda coordinate with Olof
ACTION: Firm up the draft road-map so that it is clearer and makes prerequisites clear. Add to Face-to-Face TC meeting agenda
ATTENDEES: Wendy, Jon, Dan, Christophe
Scientific Board agenda
Has CV been contacted
Scientific Board Working Group Contacts
Darren is contact
Training Group
Who needs to review what? TC does the technical review, what does the SB review or need to review? Structure of the training group and various sub-groups within the Training Group and how that should work.
Working group proposal on Data Capture (Questions and Questionnaire work from Paris EDDI, Codebook interest in expressing questionnaire, provenance of data, etc.)
World Bank has expressed interest in a descriptive entry for questionnaire
Insee has a strong interest in this area - expressing specific technical features
Data capture in general
Roadmap
Indication of what Lifecycle 4.0 would look like
Codebook change is it a possibility? Is it something that is really a profile of Lifecycle?
Broader discussion of identification and versioning: when it's required and what that means. Multiple serializations raise the question of what is a reference, what is a structural relationship, versioning.
Next meeting agenda
Get a sense of what we are going to be doing in Ljubljana
Dan and Jeremy (?)
Wendy
Jon
ACTION: Common concepts and RDF - Dan will be submitting a plan for the working group and can include Christophe and Flavio as initial members.
ATTENDEES: Wendy, Jon, Dan, Jeremy
CDI update
Production information is being finalized by CDI group and should be delivered to TC in the next few weeks
TC needs to prepare for the review of the implementation formats for accurate rendition from UML XMI
Prior to vote CDI needs to prepare a presentation for voting members
XKOS Best Practices comprehension review
Still seems to be a problem with google to icpsr routing for DDI-SRG
Check out and inform Jared
Have Christophe resend
All TC members should review this document and comment as needed
CV resolution
Progress in URL production from CESSDA tool
Once all points resolved and verified, we can move forward on live resolution system
Keep moving on Codebook
ATTENDEES: Wendy, Dan, Flavio, Darren, Oliver, Jeremy
Excused: Jon
Roadmap development
Additional materials were added and a listing of practical steps added prior to meeting
Regarding Practical Steps:
Production tools - Codebook is there something we can do to support better automation of the process
Make some notes on what documents contribute (StatsCan - example of what someone else is doing)
roadmap is kind of a GSIM view
Data Platform - remove
Metadata repositories (StatsCan)
ACTION: After completing changes noted above, send links to Darren and Hilde for SB
CV resolution system -
In reading the GitHub issue dealing with URI correction - not clear if this will resolve the issue
Oliver and Darren are looking at this (both will be in Dagstuhl to discuss)
There are still issues, particularly in terms of language management which is different from the DDI approach
It's a complex pipeline and many of our transformations address changing output from CESSDA - this could be a lot cleaner
Continuing issue of internal communication and changes occurring without consideration of issues that it causes DDI
It may be easier to pull out of the CESSDA CV manager
TC needs to pull together a proposal concerning long-term support of CV manager
It would be good to have the lifecycle codelist version (these are current products of our pipeline)
Oliver will send Dan example of codelist output to review - the following link provides access to all outputs of the pipeline that transforms CESSDA output
ATTENDEES: Wendy, Jon, Darren, Dan, Flavio, Christophe, Jeremy
XKOS Best Practice paper comprehension review
Note that XKOS is the first document to go through the technical document review process and is being used to sort out the details
Document is located at:
link-statitics.github.io/skos/skos-best-practices.htl#bp-labels
Comprehension review:
Intended audience - the question raised in a comprehension review is whether the intended audience understands what is being covered and the intended guidance
Review process - need to identify intended audience, length of review period, how comments are collected (these will vary with the document)
Announcement of publication
WHO: DDI and others known user groups
WHAT: Description of coverage should be in the announcement
[notes interrupted due to technical difficulty. connection reestablished during a discussion of some questions the TC members had regarding use of time stamp vs. version number]
Date stamp only for Best Practice, no version
The Technical review was done earlier
The comprehension review should address the following
Clear what is covered and what is not
Are the options and recommendations clear
Audience for review: NSI group, EUROSTAT working group, general DDI Users as a broad pass
Some StatsCan people will be interested
A few questions were raised based on a quick skim of the document
major revisions of classification
major versions with new URIs
proposed URI patterns - a proposal or recommendation (we recommend because...)
more text around why
Primarily it was an issue of making sure that users would understand why an approach was recommended as well as what was recommended
ACTION: Have TC members read through and note any issues
Capturing comments:
How to respond - filing in github is the best way to respond
Will have a meeting in a few weeks with the European working group for suggestions and ask for some feedback on how short/long it should be
EDDI session proposal
Turn into a presentation - Wendy
Roadmap
The Scientific Board has asked for a draft or whatever the current stage of this work is prior to the in-person Scientific Board Meeting in early October
Steps:
Last road-map document
What we captured last August 2022
Go through past minutes
Draft - pull it together
Chur - document
Put on agenda for next week
Concept/ConceptualText/Controlled Vocabularies used
next week
ATTENDEES: Wendy, Jon, Darren, Oliver
REGRETS: Dan, Flavio
EDDI 2023 session proposal
Need to add specific presenters/titles/short statement
See document: EDDI 2023 Proposals
Codebook: Concept/ConceptualText
Suggestion to keep these separate where concept adds all of the specific URL/URN links and a place for the "code". This way the label can go in the textual content of concept (all locations need to be repeatable) and the text portion of the conceptual text retains its role of containing general text related to the parent tag. This was originally a means of being able to add specific concepts to large textual pieces rather than use of the text portion to provide labels or description of a concept.
Need a guidance document on this that also pulls in CESSDA discussions and SKOS provisions. Focus on backward compatibility, ease of identifying labels (as opposed to broader descriptive text), and clear transfer of information between Codebook, Lifecycle, CDI, XKOS, CVs, etc.
https://github.com/cessda/cessda.metadata.profiles/blob/main/CDC_2.5_PROFILE/cdc25_profile.xml line 980 for example
ACTION: Wendy will write up and Darren will review and edit
ADDITIONAL INFORMATION:
Jon submitted the proposal for a TC side meeting at EDDI (number 28)
Darren will let TC know if there is any material they need to prepare for the in-person Scientific Board meeting in early October (following next week’s SB meeting)
ATTENDEES: Wendy, Jon, Dan, Jeremy, Flavio, Oliver
RDF Union Model Working Group:
Dan will be sending out an email to user list asking about interest for new proposed working group on RDF
Do we need a separate official work group for this or just a sub-set of the TC
Easier to bring other people into from the outside
Does it have more of a life if it's a formal committee
Relatively informal thing and if it gets legs then think about setting up a group
Dan will write up and we'll start with it informally
General comments on working groups and coordination:
TC's role is clear but needs to be clearly publicized
Is it helpful to have all these working groups (product and topical) if we can't keep them coordinated
Topical groups in terms of new coverage
Silos of products
New content coming out of product groups can cause issues when coordinating across groups
Groups need to have a roadmap for what they are doing and where they are going
New areas need to be exposed to others
ATTENDEES: Wendy, Jon, Dan, George, Oliver, Jeremy, Flavio, Christophe
Dan's email regarding RDF union model
Issues:
Is there enough commonality between -yes by intent and at a higher level Concept-->Specialized Concept-->Specific content
A more finite set than what Flavio and Wendy are looking at
Common representation at least at a specific level
How atomic objects are put together (similarity/differences)
What content is transferable what is specific to a product due to its use
Is there a role for Disco in this - implications in terms of Disco as a published product
Really requires a separate working group - need to refine the focus
RDF (initial but not sole outcome)
Harmonization/higher level model for DDI
It makes sense to have a dedicated group - task group but could continue as an ongoing support product
Define ties to TC as the product coordination group
Need people who are familiar with all of the different products - Codebook, Lifecycle, CDI, etc.
ACTION:
Draft new working group description - Dan will write a draft (Wendy will point to documents describing what is needed)
Can we do a combination so that there is some short term payoffs and then look with more time - how much is TC still doing and feed this into this group
Charge should focus on finding commonalities and objects that are and can be shared among the products
Group shouldn't start out looking at ontologies but at commonalities
How the group is populated
Overall goals for DDI Suite integration/interplay:
Move for products to have more common ontology/common objects
This could be a new core to align products
Facilitate the movement of content between products (working group would support) - product specific content
More consistency in future
Principles - clarify
Coping and handling various content problems (ICPSR, CESSDA, WorldBank, etc.)
Extending or surrounding - what gets covered by which parts of the suite
Need for tooling for: decision support, content transformation (to consume content and transfer), other?
What working groups would support (new group) - comparability from top to bottom
Jon's experience in Codebook to Lifecycle transfer - what are the problems, how do we develop the products in a way that eases this problem
For the last decade DDI has lived in this dualistic universe which had a tense relationship and now are becoming trinitarians with CDI. Need a credo that states there is a DDI reality underlying this and needs to be viewed as part of a broader whole.
This session should probably be written up as an article for IQ
Post meeting discussion between chair and vice-chair regarding implications for TC role in DDI:
Roadmaps for each product - what we want to do, how does it align with other products?, should it be in more products?
ATTENDEES: Wendy, Jon, Jeremy, Oliver, Flavio, Dan
Lifecycle 4.0 preparation
Start with the Bugs and move to the improvements
Add column E information comments to Jira issues. Jon will start looking at the bugs.
Presentations for EDDI
Update from TC - slot which can cover what we need it to
Could do something on (training or session) Codebook 2.6 or CDI
Common ground of products - Flavio and Wendy / presentation preparation - session rather than presentation
granularity
applications needed to support
implication for movement between others
Referencing from DDI and other products
Classification areas
TC meeting
We need to have very specific outcomes.
The people we need are the people around the technical committee
Olof should be invited
Flavio reviewing the UMI (XMI output)
RDF tasks - making the URIs for types of items across products (non-product specific) not planned for a time so we need to look at when this can go on the schedule (main item types only). Flavio/Wendy model work can feed into this in terms of identifying common elements. We can get clearer on this as we go along.
Are there validation implications between different syntaxes (XML, RDF, JSON, UML/XMI, etc.)? Implications for use, provide shape to various scenarios.
Check back in during September to see where we are and make sure we keep tightening this up.
Future meetings
Codebook work primary topic for August
Cancel next week due to attendance
ATTENDEES: Wendy, Jon, Dan, Jeremy, Oliver
AGENDA
Update on CDI work and technical review work
TC meeting - in context of EDDI
Codebook work - remaining few issues, comments from Darren if available
August/September/October work schedule
CDI
CDI draft of materials they have pulled together regarding process looks like it is covering everything we've asked for
End of August is the earliest we will get this due to vacations etc. This works well for setting up technical review. We should be ready to go with this soon after we receive information from CDI.
ACTION:
Email Arofan with one added item for common approaches (use of External Controlled Vocabularies) as well as the goal to outline parameters of technical review before we receive final materials.
EDDI - TC meeting
TC Meeting - agenda
Pick up areas where there were issues where remodeling
Were the decisions we made workable
ACTION:
Jon and Wendy will start drafting the agenda for the meeting and we can refine as we go along
Activities from Aug-Nov:
Dan COGS stuff in August/September
Go over outstanding LIFECYCLE issues for inclusion in 4.0 and getting that work done
4.0 and 4.1 applicable issues
Codebook - Schedule time ASAP August/September
Getting schema changes done in August - September finalizing documentation and process info
Flavio and Wendy - models
Drafting out when and how to roll out separate lifecycle and codebook groups and then what TC looks like after
RDF work that came out of EDDI last year - look at this after next week's review of 4.0 issues
ACTION:
Send Jon content for Codebook high level documentation
ATTENDEES: Wendy, Dan, Jeremy, Oliver, Darren
DDI Codebook 93, 95, 97 - reviewed (see JIRA issues for comments)
All issues were agreed on in general; only specifics need to be determined (exactly how changes will be entered)
Darren will look at details and provide comments on how to enter
Lifecycle 4.0
Review of issues to identify which could/should be made in 4.0, especially due to required structural changes
Review of substitution groups and other obscure XML structures
Test for round tripping between 3.6 and 4.0
ISI World Statistical Conference
ISP session on DDI 20 year cooperation with Statistical Agencies
When presentations published I'll provide links to DDI world
ATTENDEES: Wendy, Dan, Flavio, Christophe
CDI
New XML examples
Cardinality issues for wide (having to have at least one data point)
review to see if there are
List of consistent items requiring comment:
Identification
Referencing
Sequencing
External Controlled Vocabulary usage
Ability to run the script (UCMIS)
Templates included in a repository link
Run in Eclipse add-in (open source tool)
Mapping of serializations -
Why the ontologies were selected (other RDF languages)
First impression:
All properties are unique per class, but this can make querying the model more difficult (why was this decision made?)
The domains of each property are unique
No mention of cardinalities in the OWL (this is available in OWL)
Not putting cardinality in but putting in a SHAQL or SHECKL
Question about cardinality of identifiers
ACTION: Dan will write up a list of comments or questions regarding querying. The SPARQL query language is pretty powerful, but the serialization is written in a way that makes it difficult to use.
Consistent property names in past version
Codebook Status
ACTION: Review for change to resolved status
Lifecycle
Milestones - 4.0 (structural change)
Milestone - 4.1 (content updates)
ACTION: Review open issues (fixes that can go in 4.0; what needs to wait; what requires long-term discussion)
ATTENDEES: Wendy, Jon, Oliver, Christophe
Focused Technical Review for CDI representations
Reviewer Suggestions:
Olof Olsson JSON-LD (all is the request)
Franck Cotton (turtle)
Oliver Hopt
Benjamin Zapilko
FORS guys nudge specifically (from developers group)
Deirdra Lungley
Sam Spencer
[Christophe and Jon will check with a few within their organizations]
Label to use for issues filed in DDICDI: TC_review_v1.0_rc1
Codebook 2.6 - need to get this moving again and finished up; there is immediate demand for this work to be completed
ACTION: Wendy will go through and create a list of exactly what needs to be completed and provide to Jon and others that can help with completing this
Christophe raised a question:
How do we create a mapping between concepts using Lifecycle, CDI others
Current options:
Correspondence table - simple mapping, makes use of controlled vocabularies
Array concept - create a concept broader/narrower exact/similar, subclass or reference
XKOS - probably best equipped area
Statistical Classification - check that out
ACTION: Christophe will look at current options and then file a TC issue that can be looked at across products in terms of further development
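One way to picture the "correspondence table" option is as rows of (source concept, relation, target concept) that are turned into SKOS mapping statements. The sketch below is only an illustration of that idea: the example.org URIs and the function name are hypothetical, and the choice of SKOS mapping properties for "exact/similar/broader/narrower" is an assumption, not a decision recorded in these minutes.

```python
# Minimal sketch: a correspondence table rendered as SKOS mapping triples.
# All concept URIs below are hypothetical placeholders.
SKOS = "http://www.w3.org/2004/02/skos/core#"

# Assumed mapping from the relation words used in the notes to SKOS properties.
RELATIONS = {
    "exact": "exactMatch",
    "similar": "closeMatch",
    "broader": "broadMatch",
    "narrower": "narrowMatch",
}

def to_ntriples(rows):
    """Turn (source, relation, target) rows into N-Triples lines."""
    lines = []
    for source, relation, target in rows:
        prop = RELATIONS[relation]  # raises KeyError for an unknown relation
        lines.append(f"<{source}> <{SKOS}{prop}> <{target}> .")
    return lines

TABLE = [
    ("http://example.org/lifecycle/concept/income", "exact",
     "http://example.org/cdi/concept/income"),
    ("http://example.org/lifecycle/concept/wages", "broader",
     "http://example.org/cdi/concept/income"),
]

for line in to_ntriples(TABLE):
    print(line)
```

Keeping the table as plain rows means the same content could also be expressed through XKOS or a statistical classification later, once the TC issue settles which structure to use.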
ATTENDEES: Wendy, Jon, Dan, Christophe, Oliver, Flavio, Jeremy
Scientific Board - please provide Jared with any recommendations for the open position in the Scientific Board. Please verify that the person you identify is interested.
CDI review
The materials to be included in the review are all in the DDI-CDI repository: ddi-alliance/ddi-cdi
Assumption: CDI will fix known bugs prior to publication vote, requires a new package version. The time involved with having the technical review of the production process provides time for fixing known bugs and any reported during review.
Timeline
June-July get out for technical review of production process
Provide 2-3 weeks plus taking us probably through August
CDI needs to provide a revised package based on bug corrections and review
Types of questions to consider:
Use of proprietary file structures (md, spss.sav, EA, docx, xlsx, etc.). Should these be changed to non-proprietary, or instructions provided for general access?
Diagrams are in EA - folder of diagrams (many in documentation so people don't have to go into EA)
SPSS.sav could this be a set-up file with database if required
Which documents should be in the package and which should be treated as separately versioned support documents
Review interests
XML Schema and XMI - Oliver, Wendy
Production process - Jeremy, Wendy, Oliver, et al
json-ld ontology - Dan, Christophe, Flavio (ask Ben Z.)
field-level - Jon, Oliver, Wendy
High-level - Jon, Wendy, Flavio
second half may be weaker than first half
Directions for commenting
Where should we make notes issues and questions?
Specific issues in JIRA tracker - Use existing CDI tracker with labels TCReview, other one for public review
List of questions/comments we can go over - google sheet within our folder
CDI Review by TC
Limited availability over summer:
Oliver - July 1 - 18
Flavio/Wendy - ISI July 16-20
Jon July 1-24
Christophe 2 week in July and one in August
Dan won't be at TC September 28
ATTENDEES: Wendy, Dan, Jeremy, Flavio
Issues from CDI group - this could be changed prior to publication for vote
--Relaxing cardinalities in the areas of dataPoints
--Key contents
--system requirements (will be in data files but not during process)
--Understanding XML examples regarding documentation
--what put out immediately to support spec - explaining the examples more fully
--Examples still need to be passed over to TC
Topic piece for TC meeting
Probably at EDDI Slovenia 27-29 Nov (M-W, add a Th-F)
Space at MPC - as an option (talk to Cathy)
Two to three day
Flavio not sure but would need funding - Minneapolis is easier
Review of specification:
Complete except UCMIS component tooling - reinforce that this needs to be passed on
Documentation references this document so this implies this needs to be released at the same time
Need for additional review - the 2 RDF syntaxes are completely new and the XML is created with a new tool
--serializations are now changed or are new
--specific technical review for alignments and model representation, style of generated OWL etc
--we don't know the amount of review of this - ask Arofan who outside of CDI has reviewed these
Would also allow them to change cardinalities prior to vote
Check documentation for explanation of role of CDI
What has been changed more recently is the process part and should be looked at by the TC more closely, this has not been used as much as the rest
If there is a review period for serialization, it would be good to have notes available prior to that
--UML-normalized
--Normalized to OWL etc
Documentation of why and how - completing what's mentioned in the notes
There is more to review in auto-serialization process (consistency, coverage, etc.)
Track how we set this up for future reviews
--suggest specific people
--future reviews for this or other products (first serialization situations, second serializations)
Areas of review focus by members:
Jeremy in the next 2 weeks - XML serialization and production
Flavio - production process, whole stack to see everything is in place
Additional items of note
Reminder of Dagstuhl invitation for interest in attending
Interoperability workshop more open invitation
Once new version of Lifecycle is out we want to look at the physical description for simplification and broader coverage, so that NCubes and dataset reference metadata can be easily transferred back and forth. First stuff after production. Good topic for content meeting in next fiscal year requests
ACTION ITEMS:
Send request to TC members regarding review of French language versions of training slides
Inform Arofan of interest in a technical review of serializations and auto-generation processes. Who has reviewed these to date? When will the UCMIS be available?
Send request to TC members regarding internal review of DDI-CDI package: initial questions, aspects they would individually like to look at.
Reminder of the Dagstuhl invitation for interest
Due to individual conflicts on 05-25 and IASSIST 06-01
ATTENDEES: Wendy, Jon, Dan
REGRETS: Flavio
1 Membership updates
Genevieve will be leaving TC due to change in position
Ask members to each suggest one person
2 Presentation at Annual Meeting
focus is on future focuses (see ppt) reviewed approach of stating goals followed by specific areas of work
3 Schedule of expected activities through mid-September
CDI
Codebook
SDTL - Talk to George in June about version 2
Lifecycle - COGS (Dan will be working on remaining issues)
Any on-going individual work (web pages, broad modeling)
4 Administrative
Summer scheduling availability:
Jon - gone most of June
Wendy/Flavio - out July 16-20 for ISI
Cancel next 2 weeks meeting (May 25 and June 1)
Members will be notified when CDI hands over package materials so they can start looking at them independently
ATTENDEES: Wendy, Jon, Darren, Oliver, Flavio, Dan
CV update
Several issues have been resolved at CESSDA
Remaining should be resolved at meeting
ERlang (Erikson created)
Concept URIs is all that is left to fix
This should allow for test system roll-out right after Annual meeting
May 24th meeting - roll out test and work out production run
DDI Annual Meeting
Future focus:
Technical production alignment
Benefit of production lines is to open up development to a broader group (goals of COGS)
Production costs
Profile mechanisms as a stand alone product week
Alignment of individual products in terms of overall coverage of DDI
Jira issue
Question: which of those trackers need to be public
Reasonable to have them readable public - all
Standards need to be able to file issues - we can't put them all on Atlassian
It's a low frequency of filing except when doing reviews
Many want to file issues, but can't do reasonable issues - too granular or too broad
Duplication of issues or variations on a theme
Single filing system - outside of Jira with a moderator/triage system
General lack of interest in posting but it's an administrative thing
If we want to archive we can turn off public viewing
Products want to retain public review making that viewable
When someone is submitting something make sure they are notified if action takes place
ATTENDEES: Wendy, Jon, Oliver, Dan, Flavio
Regrets: Genevieve
Exec Meeting (from Jon)
Funding proposal accepted for funding
CDI update
Moving on schedule. Package to TC will contain all the items we requested including versionable package, informational pieces, and production process information
Note that all product groups will be asked for this set of information for future publications starting with Codebook
Follow-up on discussion with Daryl last week
Daryl got on both DDI-Users and DDI-SRG lists in DDI and has posted a note to those lists
He has also joined IASSIST and will be posting there also
Achim responded with additional information. I will follow-up as will Gabriel.
Matthäus is still on sick leave but Oliver will ask him to just send Daryl a note that he is interested but unable to be involved for the next few weeks.
TC report at the annual meeting
We haven't officially gotten our request from Ingo
Added Christophe to the member list. Don't see need for other changes
I will talk to Ingo and Darren about any speaking time for TC. Is it needed? It would be short.
ATTENDEES: Wendy, Jon, Oliver, Flavio
REGRETS: Darren, Genevieve, Dan
Guests: Daryl Hepting, Gabriel Gellner
Developers Group relationship with TC
Main thing is to make sure there is close connection through a monthly meeting with Wendy, Oliver, Ingo, and SB contact
There is a need to ping-pong between needs for tool development from TC regarding production or user support and the interests of individuals in the group
What could be accomplished in tool implementation? The current situation, nobody sees a real chance to create any large tools because no one has an actual ticket for large amounts of their time
They have an interest to invest what time they have - concept providing, productivity tooling
Production ready tooling would require funding - Developers group could provide good sets of requirements
Plans for next Hackathon are underway (probably in Koeln)
There will be a report for Philadelphia meeting and a textual report
Oliver is covering the confluence site so this would be a good place to make clear what is being worked on, how to bring up tool requests
Good to hear it's started and looking forward to great things
Good to have communications a priority
Alliance web pages - current changes and future options
If we have less functionality - we can look at what we want and make the proposal to Jared at a later date
Option would be to pay for added functionality or move it off-site ala CV pages
Need to coordinate this with Olof for the Tools page
Guest at half past the hour Daryl Hepting - Disco interest
Daryl Hepting - University of Regina
Publishing linked-open-data
Need for a universal linked information system
Looking at usability process of publishing data for citizens who have data
Making survey data from student survey available to permit analysis
Feedback has been collected on paper
Used a Jekyll website with open-ended into a YAML file
Now have output CSV that could be generalized to use in SHACL
Mapping to RDF
OpenRefine
RMLMapper, YARRRML
CSVW annotating CSV files with separate metadata file with conversion to RDF via templates
Used Disco to define variables in CSV file
disco:LogicalDataSet link
Interested in use of Disco
Particular interest in capturing study and group level relationships
How does Disco facilitate links between this and other vocabularies for the description and access to small data sets
This may have application as an entry profile to DDI via RDF - we have always looked for ways of expanding to student and individual researcher users
ACTION:
Send Daryl information on DDI Users list and IASSIST
Daryl will post on DDI Users and IASSIST (through Wendy) for others interested in application of Disco and his survey interest
Clean up Disco content (broken links and date issue) - Wendy
ATTENDEES: Wendy, Flavio, Dan, Oliver, Christophe, Jon
REGRETS: Darren, Genevieve
Controlled Vocabulary
CVs: issue with CESSDA output has been formally filed and Darren is following up.
Issues filed, there was duplication of languages, and there is the issue of the URNs
Duplication seems to be an issue of UA vs Rest API which results in the dup language
Activity updates
CDI is keeping up with their production schedule so we can expect that prior to annual meeting.
I am getting past family commitments and will get back to Codebook
We continue discussions with the person interested in Disco. We need to set up a call with him and I need to know who to include from TC or elsewhere. [Discussion point]
Jon, Darren, Matthäus Zloch matthaeus.zloch@gesis.org (see document for spelling), Flavio (will check with Gabriel Gellner), Christophe
2024-25 work priorities (google doc version for future work 2024-25 work priorities )
Production process alignments, fixes and revisions
Lifecycle and CDI will have issues along their production processes - bug fixes improvements
What are the requirements / use cases for the DDI Suite? Scoping, aligning scope with use cases and requirements, how do we track, discuss, and prioritize in a more organized, formal manner?
Paris meeting - RDF/OWL ontology that links together different products at the item type level (how does this dovetail or is a publication of content from the database)
Expansion of instrument to describe a tool that can be used outside of questionnaire data collection - Use of current content and the limitations of how the item is described - review of questionnaire and data capture
COGS I need actions here. This was a priority for this year and we seem to be stalled. How do we fix this?
Outputs are there right now
Review COGS tooling list of things that need to be updated (incorporating Dublin Core into it) - Dan will be working on end of June through summer to customize Lifecycle usage
Start looking at output again
Jon has a project at the UKDA writing some Python classes out from the COGS model
Conversion process from Lifecycle schema to COGS model: we inherited from primitive data types - Dan will be adding a check to COGS to not allow that (example: CV inheriting from xs:string means we are missing the string value in JSON etc.)
When making SDTL in COGS we were not dealing with the conversion process and so this is improving conversion rules
Other issues? See DDILIFE-3713
Incorporation of DC into model? Is it worthwhile incorporating an outside schema into
Include Olof on this topic
Do we want to be pulling this into the xml serialization through our own storage in xml or to reuse
External dependency adds some extra work - this needs to work across different implementations
Maybe a meeting at IASSIST - find out who will be there via a poll on the SRG list
Are there other major areas we should be looking at? Christophe, are there issues from Insee?
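The COGS check noted above (rejecting item types that inherit from a primitive datatype) can be illustrated with a small sketch. This is not COGS code and does not reflect how COGS stores its model: the dictionary layout, the property names, and the function name are all hypothetical, chosen only to show the kind of rule being described.

```python
# Minimal sketch, assuming a model held as {item type name: {"extends": base}}.
# This layout is hypothetical; it is not the COGS model format.
PRIMITIVES = {"xs:string", "xs:int", "xs:boolean", "xs:date", "xs:decimal"}

def find_primitive_inheritance(item_types):
    """Return the names of item types that extend a primitive datatype."""
    return sorted(
        name
        for name, spec in item_types.items()
        if spec.get("extends") in PRIMITIVES
    )

MODEL = {
    # Mirrors the example in the notes: a CV type inheriting from xs:string,
    # so the string value has no named property to carry it in JSON.
    "CodeValueType": {"extends": "xs:string", "properties": ["vocabularyID"]},
    "Concept": {"extends": "Versionable", "properties": ["Name"]},
}

offenders = find_primitive_inheritance(MODEL)
if offenders:
    print("Item types extending a primitive:", ", ".join(offenders))
```

A fix along these lines would typically give the offending type an explicit property for its value rather than relying on inheritance, so every serialization has somewhere to put it.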
ATTENDEES: Jon, Oliver, Darren
REGRETS: Wendy, Flavio, Christophe
TC Funding submission
Submitted document at: Funding Request 2023-24
DDI-CDI progress
Using http for Document being finalised to come to TC.
Controlled Vocabularies
Duplicates issue has been raised on GitHub, still investigating where the issue was / why and how this is occurring. Other issue is lack of persistent identifiers being produced from CESSDA. Discussion with CESSDA to look at a plan going forward.
DDI-Life 3512 - for next week's agenda
Website functionality - for next week's agenda
Functionality we'd like to have for storing metadata on, and providing search and access to, examples, tools, profiles, and relationships to other standards
ATTENDEES: Wendy, Flavio, Darren, Genevieve, Christophe, Dan
REGRETS: Jon
Introduction of new member
Christophe Dziowski (Insee)
Areas of interest:
Lifecycle work
Planning to look at CDI for data description
Large consumer of DDI objects
Issues arise that they'd like to discuss
Other questions as they arise
CV resolution update
CESSDA output dup label/discussion language entries
Darren will be meeting with CESSDA next week
Put on 2024-25 workplan review of CV output, CESSDA relationship, options
Send a note to CV group
Postman collection for test RDF system:
DDI Alliance Controlled Vocabularies
Funding
Some questions were raised about cost representations
Face-to-Face and Administrative data group requests are priorities
ACTION: Wendy will complete and send out for any additional comments and then turn in on April 10
Face-to-face
Discussed content coverage
Long-term work infrastructure and DDI Suite management
Administrative data
Need to clarify a couple of points in cost estimate with Jon
COSMOS
Infrastructure support
Approved within group
Does not address infrastructure development work. Focuses on maintenance/bug issues. Any long term needs for standard maintenance or infrastructure (cloud space for example) should eventually become a standard budget item not a yearly request
DDILIFE-3512
Dan will provide example of how to handle use of UserAttributePair of identifiable parent object with multiple CodeValueTypes. Add to a future agenda for review
ATTENDEES: Wendy, Flavio, Jon, Darren, Oliver
Location of updated SKOS output for CVs (Wendy will add object structures to database of DDI products)
ddi-alliance/ddi-cv
Funding Discussion:
Wendy and Jon will pull together specifics of the identified funding areas
Funding Request 2023-24
DISCO discussion:
Disco Discussion
Next steps:
Help facilitate Daryl's use of Disco
Set up a discussion with him
Clean up html and content of disco pages - date on html page is wonky
Clean up broken links RDF XML and N-triples
DDI-RDF Discovery Vocabulary
CESSDA and CV update:
URIs in CESSDA output have been updated (not totally right - still human readable - but not a problem for us)
Oliver has run a new refresh cycle so bitbucket repo contains 3 digit numbers (doesn't have same URIs)
Just added an extra field that allows going back from codelist to RDF
CESSDA is migrating from bitbucket to GitHub which will cause some issues
So source location will change
Should not affect our process as our bitbucket is under our control
During meetings in CESSDA after 28th April we can move on this then (based on Darren's availability)
The URIs generated on our side will remain constant over time
ATTENDEES: Wendy, Genevieve, Oliver, Dan
REGRETS: Flavio, Jon, Darren
Scientific Board Issues
TC reviewed the issues from the Scientific Board 2023 extension of their workplan. The goal was to identify what materials we needed to pull together to support further work by the Scientific Board. We also considered the timelines for these issues so that we can add them to our 2023 workplan. See the Google Doc in the TC Drafts folder:
Scientific Board Goals Requiring TC Input
DDICODE-101
We reviewed the additional comments from Taina and Sanda on this issue. See the Jira issue for additional discussion. We have proposed a solution which will be fleshed out in the next few days for review. Note that this solution attempts to balance the intent of concept and conceptualText, the needs noted by the filers, and the need for consistency over the versions of Codebook.
ATTENDEES: Wendy, Jon, Oliver, Dan, Darren
REGRETS: Genevieve, Flavio
Update on CDI work schedule - shared draft from CDI
DDICODEBOOK see issues for discussion, comment, and resolutions where decided
DDICODE-100
DDICODE-88
DDICODE-94
DDICODE-95
ATTENDEES: Wendy, Darren, Dan, Genevieve, Flavio
REGRETS: Jon
Codebook issues - some specific questions and also how you want to deal with non-response to questions we've asked filers
Updated resolutions to
DDICODE-92
DDICODE-96
DDICODE-91
LIFECYCLE Controlled Vocabulary alignment with Codebook
--Can we fast-track the addition of controlledVocabularyInstanceURN to ControlledVocabulary
--Write up issue and let TC know this is filed. Alternatively provide a consistent extension (KeyValuePair) approach until the next version when it will be added as XXX
Status of CV system - further discussions with CESSDA and our next steps in terms of reporting status and any action we want from SB
--Problem with URLs in SKOS
--They need to run a recreate task, after they remembered they regenerated
--Still saying NULL on the production system; test system is OK
--IDs in URI - are they going to be generating IDs and 3 digit versions (NO and YES) - Oliver handles IDs post production
--Miles will be leaving
--Darren will talk to Carsten about when this will be deployed
--Progress is slow
--Need to get them to publish the correct URIs
--Will be switching from Bitbucket to Github
--Darren is following up with Carsten about who is handling this when he leaves; can we expedite deployment
--We have to document this work so that there is a communication channel between CESSDA and TC as we are dependent upon that
--We both seem to be interested in formalizing this contact so we should take advantage of the timing
--Darren will continue to pursue
--Need to get informed about every deployment on production system in advance so we could test against our system
New issues with DDI Alliance web pages and future changes in Drupal and page options
--Loss of data base options for tools, examples, etc.
--write up what we do use and how those features work
--send to Oliver and Olof for recommendation
--Might want to do similar to what we did with CVs in bringing off-site to a content management system
ATTENDEES: Wendy, Genevieve, Dan
REGRETS: Jon, Flavio
Discussed Marketing Group role per request from Jared. Notes located at
ATTENDEES: Wendy, Jon, Darren, Oliver, Flavio
REGRETS: Genevieve
Clarification of TC check list for product review prior to publication
Start with the old document of what should be in a public review
Add info on license
Something done in August and earlier version sent to Arofan
Bring that information together, clean it up, post it, inform product groups
ACTION:
Wendy will locate previous documents and discussions
Jon will pull this together into an informative document that can be published and shared with product groups
SB-12
What is the scope
Meta deliverable - there needs to be a general product lifecycle methodology
Are we going to include a testing methodology
Recommended:
Move to TC
Development work is done within a working group
Detailed thing makes no sense
Identifying things that are done centrally (licensing, maintenance)
Clarify the script - management of requirements
How to deal with new requirements - high level is in procedures document, does this need more details or a means of tracking cross product expanded coverage
Capturing requirements - parallel implementation in other products - this needs to be fleshed out
ACTION: Darren will work on wording based on this discussion
Codebook issues needing comment from Katja and Sanda have gone out (5 issues)
Postman promotion of DDI Controlled Vocabularies and other content
DDI Alliance Controlled Vocabularies
and documentation is at DDI Controlled Vocabularies (CVs)
Currently done under Darren's account with Pascal
Make this more a formal connection API
If this is a worthwhile activity then we could make this more formal
Oliver does not use but others do
Is there an intent to use GraphQL? Pascal is enthused to devote some time to this
We could develop tidier URL to this and promote more
This is a more developer friendly resource
It is a supplemental piece not a replacement
"Augmenting the value chain of DDI"
If it could be formalized and promoted it could be more relevant
This should have a designated TC contact (Darren)
This may not be a TC decision per se should make recommendation to SB
There is the sustainability issue
TO DO:
Follow up on a call with Pascal and then follow-up with the SB
Explain the value of adding this approach
ACTION: Darren will prepare something for the next SB
CV work:
The SKOS is full of nulls for all of the URIs
They have changed the way their REST service responds
Oliver needs to do a few edits to the process to handle all the null entries
ADP was supposed to come along and sort this out and it's now way worse
TC issue here is that it stops our work - they changed the output
Is there a validation system for this -- NO
The URLs used to be valid within just one document
We may just need to take the CVs out of CESSDA
We would need to provide a replacement for the entry work
ACTION:
Darren will write up a report of the blockade to our work
Need to have some form of assurance from CESSDA that they are continuing in their commitment (to Bonnie, member rep) with Carsten leaving - written request from Hilde as Chair of SB, Darren will include this request in his report
ATTENDEES: Wendy, Jon, Dan, Darren, Oliver, Genevieve
EXCUSED: Flavio
DDI Alliance web pages (Products section)
Ideally we should be able to generate a landing page for CVs
Don't understand where we currently are on the Drupal
ACTION: Obtain information on version used and determine options
GOALS:
Simple navigation - very wordy (comments on current situation)
--Product page with 6 or seven different products (Overview page)
--Overview of Current Products is heavily wordy - need to review and modify
--Individual pages need to be revised - separate issue see action items
The normal approach for software or standards is to go to the current version
Deprecate the old page and redirect the generic URL to go to the current version
Landing on DDIalliance.org/Specification --> overview page
individual headers (sans) version should go to current version of the product
configuration of redirects to current versions - need to specify this information
The problem with the old specification page is that it is retrieved by true Google
Redirect to a location mentioning old page is deprecated and Google will learn new URL over a six month period
How does DDI manage old content? Is there a process?
ACTION: Create a set of redirects that we want
ACTION: Darren will send suggestion for product page structures
ACTION: Oliver will look into Drupal capabilities
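The redirect idea discussed above (a versionless product URL resolving to the current version of that product) can be sketched as a simple lookup table. Everything concrete here is an assumption for illustration: the paths, the version numbers, and the function name are placeholders, and the real configuration would be done in Drupal or the web server rather than in Python.

```python
# Minimal sketch of a "versionless URL -> current version" redirect table.
# Paths and version numbers are placeholders, not the agreed set of redirects.
from typing import Optional

CURRENT_VERSION_PAGES = {
    "/specification/ddi-lifecycle": "/Specification/DDI-Lifecycle/3.3/",
    "/specification/ddi-codebook": "/Specification/DDI-Codebook/2.5/",
}

def redirect_target(path: str) -> Optional[str]:
    """Return the page a versionless product path should redirect to, if any."""
    key = path.rstrip("/").lower()
    return CURRENT_VERSION_PAGES.get(key)

for requested in ("/Specification/DDI-Lifecycle/", "/Specification/DDI-Lifecycle/3.3/"):
    print(requested, "->", redirect_target(requested))
```

Holding the mapping in one table means a new release only needs one entry changed, which is the information the "set of redirects" action item would need to specify.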
DDICODE-101
Concept to provide a home for the label in multiple languages?
When the vocab stuff was added
Documentation needs to be added to
A number of these provide only the token or code which is single
Add an attribute for a code value/token
ContentType is an extension of Simple Text.
Several locations Content used directly.
ACTION: Get back to filers and find out exactly what they want to express and review solutions
DDICODE-95
ACTION: find out where are you using ExtLinks/Links
DDICODE-90/91
RESOLVED: Add access attribute specifically as noted IDREFS (sdaterefs is still available broadly)
ATTENDEES: Wendy, Dan, Jon, Oliver, Genevieve, Darren
DDICODE-101
Notes are on the issue. We discussed both the request, the desired usability based on the request, and software issues. Darren and Dan will go through carefully. This will probably need a meeting with Katja and Taina to resolve.
ATTENDEES: Wendy, Jon, Oliver, Genevieve
Went through SB draft of 2023 workplan and added comments in time for SB meeting on 2023-01-17
Reviewed number of Codebook issues that were outstanding. Resolving or identifying where more information was needed see Jira with filter Codebook-Outstanding
ATTENDEES: Wendy, Jon, Flavio, Dan, Genevieve, Oliver
Reviewed document of SB work plan making notes regarding updates and requests for more clarity
Review Jon's comments on google doc
Get to SB prior to Jan 17th meeting
Jon will draft the report form so we can respond and get in by due date to SB
COGS still has a couple tickets left
the main issue for XML validation: ticket 292 DC namespace issue
deciding how xml:lang should be mapped - move to xml:lang tag when it's talking about the overall language of the content
In JSON we are treating this as a property
In RDF there is no xml:lang and so we need to address this
Could be done after conversion so we are not doing a lot of editing.
Dan will look at what needs to be done and when they can fit it in
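The xml:lang question above can be made concrete with a small sketch: the same labelled string carried as an xml:lang attribute in XML, as an ordinary property in JSON, and (as one possible option) as a language-tagged literal in RDF. The element name, the JSON property names, and the use of a language-tagged literal are assumptions for illustration; the minutes do not record which mapping was chosen.

```python
# Minimal sketch: one string with a language, shown in three representations.
# Element and property names are hypothetical.
import json
import xml.etree.ElementTree as ET

# ElementTree exposes xml:lang under the reserved XML namespace.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

SAMPLE = '<Label xml:lang="en">Household income</Label>'

def to_json_property(elem):
    """Represent the language as an ordinary property next to the content."""
    return {"content": elem.text, "language": elem.get(XML_LANG)}

def to_turtle_literal(elem):
    """Represent the language as a tag on an RDF literal (Turtle syntax)."""
    text = (elem.text or "").replace("\\", "\\\\").replace('"', '\\"')
    lang = elem.get(XML_LANG)
    return f'"{text}"@{lang}' if lang else f'"{text}"'

label = ET.fromstring(SAMPLE)
print(json.dumps(to_json_property(label)))
print(to_turtle_literal(label))
```

Doing this mapping after conversion, as suggested in the notes, would keep the change to a post-processing step rather than an edit to every item in the model.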
CODEBOOK
Try and complete that in January.
ATTENDEES: Wendy, Flavio, Dan, Oliver
Reviewed the notes from the Integration Languages workshop
ACTIONS:
Post notes as confluence pages (done)
Draft content for database of DDI objects by product based on decisions on Common Objects page
Finalize the description of the RDF URL decisions from the meeting and verify with ICPSR on needed redirects
ATTENDEES: Wendy, Flavio, Oliver, Dan, Darren
NOTE: No meeting until December 8 due to US holiday and EDDI
AGENDA: Carry over from last week
See minutes from 2022-11-10 for RDF URI document
ACTION: Darren will make the changes noted and contact Achim and Pierre-Antoine
Recommended sub-agencies:
initials are consistent with past identification as DDI-L, DDI-C, and DDI-CDI
Lifecycle: int.ddi.l
Codebook: int.ddi.c
XKOS: int.ddi.xkos
SDTL: int.ddi.sdtl
Controlled Vocabularies: int.ddi.cv
CDI: int.ddi.cdi
ACTION: Wendy will write up and contact Jared and Darren about getting these filed. Product maintenance groups will be informed of the name of the sub-agency as filed.
Update of 2021-2022 TC work plan to cover 2023
TC-Workplan-2021-22rev2-2023extension
See comments on document
ATTENDEES: Wendy, Jon, Darren, Oliver
Test RDF-vocabulary:
Controlled Vocabularies - Overview Table of Latest Versions | Data Documentation Initiative
Production will be https://rdf.ddialliance.org/controlled-vocabularies
Scientific Board extension of 2021-2022 plan to cover 2023
SB plan extension - send link and identify what can be updated for TC
Draft of URI construction
- TC_10_Nov_22_Proposal-for-common-RDF-namespace.docx
16 Nov 2022, 03:54 PM
use of rdf.ddialliance.org/[product]/[version]/
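The rdf.ddialliance.org/[product]/[version]/ pattern can be sketched in a few lines. This is only an illustration of the layout noted above: the function name and the example product and version values are assumptions, and the actual product path segments are whatever the proposal document specifies.

```python
# Minimal sketch of the proposed namespace layout:
#   rdf.ddialliance.org/[product]/[version]/
# Product names and versions used below are illustrative only.
BASE = "https://rdf.ddialliance.org"

def product_namespace(product: str, version: str) -> str:
    """Build the namespace URI for one version of one product."""
    product = product.strip().strip("/").lower()
    version = version.strip().strip("/")
    if not product or not version:
        raise ValueError("product and version are both required")
    return f"{BASE}/{product}/{version}/"

print(product_namespace("lifecycle", "4.0"))
```

Ending the namespace with a slash lets individual terms be appended directly to it, which is a common convention for RDF vocabularies.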
DDI Sub-agencies for products in DDI Agency Registry:
Should each product have a sub-agency to file?
Draft a list and verify the names to match the CV
Lifecycle: int.ddi.life or int.ddi.l
Codebook: int.ddi.code or int.ddi.c
XKOS: int.ddi.xkos
SDTL: int.ddi.sdtl
Controlled Vocabularies: int.ddi.cv
CDI: int.ddi.cdi
ATTENDEES: Wendy, Jon, Flavio, Oliver, Geneviève, Johan
Codebook issues from review [comments and notes are listed in Jira issue]
ATTENDEES: Wendy, Dan, Oliver, Flavio
[Darren provided a contribution via email]
Guidelines for contributions to Lifecycle - how that works with new COGS system and new available options for contribution
Current status of validation rules:
validate — COGS 1.0 documentation
Validation checks are made. The rules exist because of the multiple bindings; validation checks that the model is valid for each implementation
BACKGROUND:
There are rules for adding/modifying things in COGS. These are process rules and incorporate the rules defined in the validation system.
The guidelines under discussion are not COGS specific but contributor guidelines specifically for Lifecycle.
Audience would be not only COGS but also consider audience of content working groups. This could be a framework for contribution guidelines for other products, even those not using COGS. They would be useful for content working groups who are defining new content areas which may be incorporated into multiple products.
Guidelines should be provided for all products
TOPIC AREAS:
Identification:
What needs to be identified and what doesn't (easier to integrate across products and retain consistency)
Patterns:
Patterns - identify important patterns
Grouping models - provide patterns
Use of grouping mechanisms for identified items that may need packaging and later management
Describable type - name, label, description (describable type) Use of describable as base
Use of common structures:
Controlled Vocabulary usage
String structures
Documentation:
Standard documentation lines - markdown documentation for each item or item type, plus property level
Rules for content of documentation (should this be added to validation?)
Property documentation is limited in what is allowed (due to limitations of implementations)
Item type information is most useful in the web documentation; it is also available in several serializations, but generally in plain text
Guideline to keep it primarily text based - use the article page to provide graphic details
IMPLEMENTATION LANGUAGE SUPPORT:
Implementation characteristics and variance between implementation languages (leverage results of Implementation Language Workshop)
Whole notion of packaging - varies by implementation (RDF for example)
WORK PLAN:
Review the issues we've noted for modeling decisions, organize the guideline issues noted above
Review tickets regarding development/modeling guidelines - look at how these should work going forward
How new options would work. Different groups, new tool for content/product groups
Might need to go with a number of discussions with people to align with current content and modeling guidelines
Workplan for RDF registry system (CV and XKOS): [submitted via email - Darren]
wc 24-Oct
*Final tweaks by Myles/Oliver
*Send email out to CV WG and DDI Community asking for feedback
wc 31-Oct
Out for feedback and public review
wc 07-Nov
Out for feedback and public review
wc 14-Nov
Final tweaks and finalise documentation
Should we have formal TC/SB sign-off?
wc 21-Nov
*Launch accompanied by webinar or other comms?
*Announcement at EDDI
Next Week:
Organized work plan for guidelines work
Discussion outline for boundary issues
Define what is being discussed and any current issues out of Scientific Board
ATTENDEES: Wendy, Jon, Darren, Dan, Oliver, Flavio, Genevieve, George
Presentation on registry for Controlled Vocabularies and RDF Implementations
Controlled Vocabularies
The SKOS output from CESSDA does not have resolvable URLs
We pull those into Bitbucket, modify them, and import them into our repository
CV manager --> Bitbucket repository --> Resolver
ADD to HTML page content:
Canonical URN
Canonical URL (RDF)
Canonical URL (XML)
If there is a continued need for old-format Excel files, the recommendation is to create a transform script that can be used with the SKOS file, so the conversion is done only when the user needs it
Need CV group to provide content for the overview/index page
Darren will contact
Postman
[displayed content in Postman noting the content coverage for an object]
Example
Concept - missing accept (HTML); HTML, Turtle, RDF+xml, N-Triples, JSON-LD
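A minimal sketch of how a resolver might pick a serialization from an HTTP Accept header, covering the formats listed above. The function name `pick_format`, the fallback to Turtle, and the simplified header parsing (q-values are ignored) are assumptions for illustration and do not describe the behaviour of the actual resolver.

```python
# Standard media types for the serializations named in the notes.
SUPPORTED = {
    "text/html": "HTML",
    "text/turtle": "Turtle",
    "application/rdf+xml": "RDF+xml",
    "application/n-triples": "N-Triples",
    "application/ld+json": "JSON-LD",
}


def pick_format(accept_header: str, default: str = "text/turtle") -> str:
    """Return the first supported media type named in the Accept header.

    Simplification: entries are tried in the order given and q-values are ignored.
    The default is an assumption, not a documented resolver setting.
    """
    for entry in accept_header.split(","):
        media_type = entry.split(";")[0].strip().lower()
        if media_type in SUPPORTED:
            return media_type
    return default


print(SUPPORTED[pick_format("application/ld+json;q=0.9, text/html")])
```

A check like this would also surface the "missing accept (HTML)" case noted above: a request for `text/html` should resolve to the HTML page rather than fall through to the default.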
CV production process:
Currently manually triggered
Going into an automated structure - do a periodic dump (can set up as needed)
Commit based on if there is change
Reinitializing being triggered by a commit
CV group would mark as published - the rest would happen automatically
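The "commit based on if there is change" step could be decided by comparing a content hash of the periodic dump with the hash of the last committed copy. This is a sketch only: the function names are hypothetical, and the real pipeline may detect changes differently (for example, by relying on the version control system itself).

```python
import hashlib
from typing import Optional


def digest(content: bytes) -> str:
    """SHA-256 hex digest of a dumped vocabulary file."""
    return hashlib.sha256(content).hexdigest()


def needs_commit(new_dump: bytes, last_committed_digest: Optional[str]) -> bool:
    """True when the dump differs from the last commit, or nothing was committed yet."""
    return last_committed_digest is None or digest(new_dump) != last_committed_digest


# Illustrative content only
previous = digest(b"<skos>version 1</skos>")
print(needs_commit(b"<skos>version 1</skos>", previous))
print(needs_commit(b"<skos>version 2</skos>", previous))
```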
Test Group:
Carsten
Franck
CV group
GESIS group (Oliver will contact)
TC
Inclusion of Disco in registry:
Disco - implementation based on Codebook/Lifecycle model discovery content
Never published
No one has taken ownership and done work on it in the past 5-7 years or more
If it were published it would go in this registry as an RDF product of the DDI Alliance
URL namespaces
DDI-CDI turtle needs to be delivered through system
Agency names for products:
Use of sub agencies for different product agency names
Look at comparison of like-named objects in terms of differences between specifications
Next steps:
Tune up internal links
Complete changes as noted above
Set up the link on agency registry (Dan will check if Tech person can make changes; Darren will contact Jared as administrator of int.ddi.cv)
Meeting with CV group to get overview/index page
Contact Franck and Pierre-Antoine regarding namespace
Testing can begin around the 24th
Create Documentation - Oliver and Darren (next week start)
Webinar on CV's and how to use them; XKOS also
Coming soon from Scientific Board
Review expanded Scientific Plan -
Add emphasis on high level integration model
Add technical development platforms
Consolidate support task for individual product groups and content groups to a single item covering multiple groups
ATTENDEES: Wendy, Jon, Flavio, Oliver, Dan, Genevieve
Regrets: Darren
Implementation Language Meeting
Process:
Do requirements gathering on Friday
could be very broad
Make sense of this over the weekend - turn into a coherent agenda for Monday and Tuesday
Guide for review on the weekend (Wendy and Flavio will look at this...keep general)
alignment across models wherever it makes sense
review on basis of requirements
nothing gets dropped - just tabled for future discussion
Talk about that on Monday and Tuesday
Known topics
Priority languages
How that could be done in terms of alignment
what can be done or can't be done
Send out to users list next week about the Friday meeting (Jon)
Open a document for people to put them in (queries, comments)
Send this document to participants
Get a note on the EDDI web site (Jon)
FUTURE AGENDAS:
COGS was discussed last week
13 October
Darren - Test platform for XKOS and CV RDF resolution
Second half of October
Boundary issues for TC and production guidelines etc.
Contribution guidelines for Lifecycle - how that works with new COGS system and new available options for contribution
Beginning November
Codebook review - send out reminder (Wendy)
CDI submission to TC
ATTENDEES: Wendy, Jon, Oliver, Flavio, Darren, Dan
Apologies: Genevieve
Implementation Language meeting in December 2022
Friday - 2 rooms so up to 30
Thursday - TC meeting and people can
Jon - will check to see on room limitations
Clarified the role of the Friday meeting and goals of Monday/Tuesday meetings
Provided criteria for Monday/Tuesday meeting involvement
Jon will draft a more coherent document to share with CDI
Preparations for Lifecycle 4.0 Beta
Identified places where documentation needs changing through COGS - WLT will provide content to Jon, Jon will make changes
Dan clarified what changes needed to be made in the way DCTerms are handled, name conflict, xml:lang, and one or two other minor points. These may not be completed until November - Dan will keep us updated so that outputs can be reviewed
Wendy will file issues for future discussion on Lifecycle
Jon will review status of Topics list and add to agenda in the next few weeks
Codebook
Review has generated a couple of comments which have been addressed
ATTENDEES: Wendy, Jon, Darren, Oliver, Dan, Flavio, Genevieve
AGENDA and notes:
Welcome to new member
Genevieve Michaud from SciencesPo has joined the Technical Committee as of this meeting
Bitbucket access issues
There seem to be a range of access issues going on in terms of uploading content
Jon will explore and get back to Wendy on this - others will be contacted if there are still issues
Provide a brief document on addressing problems that can be posted to help others in future situations
Working Group Relationships
PDF diagram for comment
TC drafts - Google Drive
Comment that there needed to be some indication of communication lines, particularly between the two sides of the diagram
working group contact with Scientific Board
points where there needed to be good communications between specific groups (e.g. TC and Training or Marketing)
Process Papers
Technical paper review
Review Process for Official Technical Documents
Product process
Development and Review Processes for DDI Standards and Official Technical Documents
Plan for completion:
Jon will assist
Combine the 2 documents
Add short descriptions of the role of TC, Board, Community
Add how new areas are identified and working groups created (ref SB procedures)
What happens to work that stops getting worked on or is determined to be no longer relevant
Minimal needed
--Working groups need to have notes on what they do and when they meet
--Google docs should be downloaded and attached for a record of work
--decisions recorded
--issues addressed/discussed
FUTURE AGENDA ITEMS:
2023 continuation of Scientific Board Working Plan
--If there are new areas we are working on, be sure to bring them up when we get something from the Scientific Board to respond to
2024-2025
--Consider what should go in this (probably will not work on in Scientific Board until spring 2023)
ATTENDEES: Wendy, Darren, Oliver
excused: Flavio
Codebook version 2.6 public review
package has been posted on review site
sent preview link to several very active group members (Taina, Darren, Julian, Mehmood)
last thing to complete is the high level document draft to be completed tomorrow
Send Jared text for review announcement Friday
Jared will send out to the usual suspects on Monday, Julian will distribute to additional groups (I'll ask for which lists were used)
Task list from August meeting
ACTION: Wendy
Finish section on TC August meeting report and request comments and additional detail for follow-up task list as needed
Scientific Board meeting
The following items are points the TC needs to focus on in the next few months:
Approach to cross-DDI relationship
conceptual mapping
transformation mapping at different levels - CDI has done some research on ways to express this
Learning pages
Resource page
relationship to other standards
Getting started page - could use some ideas for re-do
Scientific workplan schedule
Continuing resolution extending 2021-2022 to end of 2023
Two year plans starting with 2024-2025 (calendar year)
Planning starts in 2023 for the preparation of a workplan to send out to scientific community for comments and approval in June (six months prior to start of plan)
Process papers on publishing products and official technical papers
merge the two documents
add process flow diagram
Add information on maintenance (how long do you continue to fix bugs on old versions)
Creation and maintenance of a product list including all versions
Include full process: how development projects for a product are dropped, tabled, or resurrected, in addition to publishing
Include a similar process for developing and implementing new or improved content - this could tie into the previously discussed tracking of requirements, the requirements for creating working groups, and the new options for use of COGS
Important to get work going on development and publication requirements so we can get it into the document and shape this discussion in the broader SB
Goal is to prepare a full process document covering the initiation, development, publication, decision points (start, table, end), and maintenance of products (specifications), technical documents, and expanded coverage topics in the DDI suite of products
URN and Http resolution - there were suggestions for improving the presentation of instructions and doing marketing and training - they will get back to us with detailed suggestions
ACTION ITEM: (Darren)
Produce a short proposal consolidating URLs and URIs for DDI specifications (XML, RDF, XMI, JSON, C+)
What we currently do:
https://ddialliance.org/Specification/XKOS/1.0/OWL/xkos.html
https://rdf-vocabulary.ddialliance.org/xkos.html
ATTENDEES: Wendy, Jon, Darren, Flavio
DDI-CDI Bitbucket proposal
Reviewed and have no problems with this. Resulted in a review of current Bitbucket layout and the need to clean up and clarify repository of published packages.
meta project landing page
Clean up bitbucket space
have repository of all version packages
DDI Atlassian Member Management
Reviewed draft and confirmed goal of encouraging use of primary tools and management areas for capturing the work of the Alliance. We added a few points on follow-up work, particularly in working with the workshop coordinators to meet current and future needs within licensing constraints
TC meeting follow-up
Discussed set up of a full meeting page with links to workplan, notes, and meeting report. Will contain executive summary and list of tasks generated in meeting.
ATTENDEES: Wendy, Jon, Jeremy, Darren, Flavio
1-Notified TC on the content of the ISI IPS proposal
25 years of DDI working with statistical agencies
Enthusiastic response from invited speakers/agencies
2-Implementation Meeting (around EDDI)
Practical perspective: Jon will talk to Alaina about space
Send outline of meeting to Arofan
3-Codebook: update on preparations
Send to Jared for review announcement
Finish CV and Geography sections of high-level documents
4-Review of workplan for TC August meeting
Workplan for August 2022 TC Meeting
Ambitious, but nothing missing
Specifics will arise during discussions
Good to bring in the overall process workflows and tooling information
ATTENDEES: Wendy, Dan, Jon, Darren, Flavio
Implementation Language meeting around EDDI
Funding for Fall meeting on Implementation Languages has been approved. We need to develop a more fully developed outline for this meeting to discuss with CDI later in July.
Week after would be better due to room availability
EDDI Right after Thanksgiving (Wednesday-Thursday)
Hackathon (check with Ingo)
More generic meeting on the Friday for broader contribution and gory details the next week (3 days?)
Identify specific implementations
What needs to be done in a consistent manner
What can be flexible
Explanation of usage needed
Codebook 2.6 review
wlt provide Jon with link - upload
COGS
No Jon next two weeks - no actions on COGS until then
August meeting of TC in Minneapolis
Funding has been approved
Specific goals and outputs from August meeting:
--Document changes in structure
--XML, RDF, JSON, UML/XMI (canonical), Sphinx and restructured text documentation
--Rules and output where possible - reconciling between the schema and outputs
--when we cut the cord that we are able to get back to the 3.3 schema
--What the different serializations look like - just doing vanilla transformations are a bit pointless
--Technical review - can roll out serializations separately
--XML and JSON are currently the most used and most stable and are probably ready first for review
--RDF and UML/XMI has been tested by EA
--If we wanted canonical XMI we have to change output or use the translation tool used by CDI
--Build in canonical XMI as an output (currently has 2 - Normative 2.4.2, 2.5 with diagram and diagram exchange) new flavor would be canonical
--Discussion of TC future leadership and roles
--Change log of XML structure between 3.3 and 3.4 to see what changed and why (supporting multiple serializations)
--Talking about changes - tree hierarchy is still the same but there are other things that are being updated to XML centric (CHOICE models) - substitution groups, removing a few things that are remnants of 3.0 and 3.1 (Identification and Reference properties)
CANCEL Next two meetings - June 30, July 7
ATTENDEES: Wendy, Jon, Dan S., Flavio, Darren
AGENDA:
--Respond to SB comments on Review Process for Official Technical Documents
Document was revised in response to comments from the Scientific Board and Franck Cotton
Recommended that this document be integrated into the Process Document for standards publication. Hilde and Ingo will be notified of update
--ID and Reference for COGS DDILIFE-3703: Review of properties from ReferenceType for use in 4.0 (OPEN)
Some things have been injected during serialization. Should these be flagged in some way rather than have them just in code
The model definition itself doesn't need any knowledge so perhaps in the settings area rather than just in the code. Clearly shouldn't be in the model
This would be useful for documentation at the serialization level.
See issue Number DDILIFE 3703
--Publication process for CDI - new date is mid-July
--submission
--parts delivered
--difference on contents to determine if additional review is needed
--clear statement of what members are voting on as this is a new product for the DDI suite
Place on future agenda - Please think about this so we can move through the process expeditiously
--DDI Codebook comparison to other products - work done by SND - how to integrate this into DDI work on comparison SSHOC DDI Codebook Metadata Crosswalk
Place on future agenda
ATTENDEES: Wendy, Oliver, Dan S
Updated on CDI status, Codebook review status, COGS input corrections, and LOD work at UKDA
Discussed August in-person meeting including increase in air fares and goals for the meeting
Focus is on output of COGS
Produce implementations that can be used for a broad technical review of new XML structures and new implementations - this is a technical review of the viability of these new structures and identifying issues that should be addressed.
Discussion of TC direction over the next few years - new members, leadership changes, strategic directions
ATTENDEES: Wendy, Flavio, Darren
In person meeting
If funded by Exec Committee the TC in-person meeting in Minneapolis at ISRDI has been approved for August 1-5, 2022. A note has been sent to TC members confirming the dates.
Presentation at Members Meeting and Scientific Board
The Members meeting presentation is only 3-5 minutes in length. Focus will be on funded activities and Codebook work. Will get a short statement from Darren so that this work is presented accurately. Next year's focus will be on coordinating multiple implementations of products and progressing on the production platform.
COGS decisions regarding Identification and Representation properties, use of citation
Reviewed issues and had a brief discussion of properties
Wendy will create Jira issues (related to GitHub issues) so that a full discussion is recorded and retained within the design issues discussed in TC. (GitHub is specific to the COGS implementation and this should reflect both the issue and decision). Google documents will be used to focus discussion and linked from the Jira issue (these will be downloaded as documents and attached when the issue is decided).
Members will be asked to review and comment
COGS input issues (xml to csv) will be addressed as follows:
Modify transformation program where easy to do (new columns on csv with default values where information is not available in xml, etc.)
Export resulting csv files, edit, and reload - this is a one time import and the issues are primarily due to inconsistencies in the xml
Focus on import transformations should be on scripts that will be reused (example: import of canonical xmi from CDI to create a COGS copy)
XML will be modified for addition of managed representations for those few not currently available and inclusion of other physical data product content using name modifications for elements with same name in different namespaces. The intent is to support the shift to serialization and use of inclusion by reference and to provide all the record layout options for review without massive remodeling.
Handful of complex choice nesting locations will be mildly remodeled and documented in csv files.
ATTENDEES: Wendy, Jon, Darren, Oliver, Dan S.
Annual Report
No additional comments, will leave available through Sunday and then send in
Codebook
Field level documentation HTML version has been completed
Wendy will review Change Log and update, set up review page and create the google page for the high level documentation draft
Review should start by end of May. This should last to the middle or end of August
Anticipate easy review because there has been a lot of consultation - this should mean that developers can begin work using the Review version with some certainty
COGS work
Reference:
Reference properties outside of the identification is the issue
Reference is an entity and will act differently with serialization - RDF is a URI, XML / JSON model
Review specifics of properties - Wendy will document
Example: Where you need to know a context of what you are referencing add another reference for context.
Including extra properties
Option is to create Complex types
Reference becomes an element in the complex
Where is such extended content needed
Identifiers:
don't think there is any difficulty by serialization. Some have their own identification (URI in RDF)
Drop the difference between Versionable and Maintainable. There is only one way to identify an item (URN, AGENCY, ID, VERSION)
Additional properties
Describable class extension of versionable
Properties found in versionable - review
Identification is injected during creation of serializations
Identification is injected into every item type, so any item type will have identification
Complex data types DON'T get Identification
Another area of confusion for additional properties is the use of Name/Type/Description and Citation (currently used singularly or together in some elements)
Look at where citation is used and how. OtherMaterialTypes (clean-up later) Group? StudyUnit? ResourcePackage, Instance, etc.
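To make the single identification scheme noted under Identifiers concrete (agency, ID, and version, combinable into a URN), here is a small sketch that builds and parses a DDI-style URN of the form `urn:ddi:agency:id:version`. The class and function names are hypothetical, the sample values are illustrative, and this is not how COGS itself injects identification during serialization.

```python
from typing import NamedTuple


class DdiId(NamedTuple):
    """Illustrative holder for the three identifying parts of an item."""
    agency: str
    item_id: str
    version: str

    def urn(self) -> str:
        # Assumed layout: urn:ddi:<agency>:<id>:<version>
        return f"urn:ddi:{self.agency}:{self.item_id}:{self.version}"


def parse_urn(urn: str) -> DdiId:
    """Split a URN of the assumed form back into agency, id, and version."""
    parts = urn.split(":")
    if len(parts) != 5 or parts[0].lower() != "urn" or parts[1].lower() != "ddi":
        raise ValueError(f"not a DDI URN: {urn!r}")
    return DdiId(*parts[2:])


# Sample values chosen for illustration only
example = DdiId("int.ddi.cv", "ModeOfCollection", "1.0.1")
print(example.urn())
print(parse_urn(example.urn()) == example)
```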
CHOICE issues are being sorted
PhysicalDataProduct
Add NCubeInstance to the basic RecordLayout now in the main schema
Make the others derivatives of the BaseRecordLayoutType with unique element names and move the content to PhysicalDataProduct
This will get everything in AND let us use one PhysicalDataProduct to import
Representations
Recreate the few representation types that are not in the ManagedRepresentations.
This gives them the same structure and we can determine later if we want to continue this division between representations that have references to other objects or what.
CV and RDF resolver:
The sandbox will be set up end of next week
https://rdf-vocabulary.ddialliance.org/cv
e.g. https://rdftest-vocabulary.ddialliance.org/cv/ModeOfCollection/1.0.1
ATTENDEES: Wendy, Jon, Darren, Oliver
CV work
UKDS are finalising server setup next week.
Oliver & Darren working together to get content in once it is setup. Target by the end of May.
DNS needs to be setup at ICPSR.
DDItoCogs issues from Virtual Meeting
Physical Structures (when to do remodel - couple of options for testing COGS input)
Representations (when to do the remodel from mix of in-line and reference)
Reference structures
Identification
Standing on the Agenda
TC will still have a shorter separate presentation (WG updates will be summarized by SB chairs)
Report for the Scientific Board Meeting
Written report to the meeting
Draft has been started - Wendy will complete and circulate
Date for F2F Technical Meeting
Canvass for dates – August 2022
Codebook Review
Field Level Documentation – Jon to do
Documentation will be description of changes
Full High Level documentation to follow
ATTENDEES: Wendy, Jon, Darren, Dan S.
Update on CDI
--CDI is planning to get the package to the TC on June 1st. They realize we won't be able to deal with it until after IASSIST, but given the history of delays over the years, this lets them put a "win" in their column by getting it passed on.
Virtual Meeting:
Updated the page with correct dates and links to the relevant issues in TC
9-11 CDT M-F plan on meetings. These may not be used but keep the slot free
Topics to discuss:
Review where we are at with the input transformation
Complex choice - status and how to redesign where needed - chat first thing and then what we need to discuss later
Substitution Groups:
Choose specific substitutions to use and how to implement
XML substitution groups - is there stuff we want to merge or can we remodel that
There is no model for the DCterms terms, these have been replaced by primitive types
Reference
Design rules (managed references rather than modeled)
Things that derived from Reference need to be added in other ways
Handling additional reference information in terms of the overall model as it has to deal with multiple serialization targets
References - is there some content that we need to retain that is not there
Rationale to document:
This will simplify the structure for everyone - in the future
Draft of what was changed
References were converted from a type to an auto-generated
Organization:
Monday
Discussion on above
Sort out who is doing what and when we get back together to discuss each point
COGS validation process - what can be done intentionally and what needs to be done
Tuesday
Discussion about what we are trying to do with the UML XMI input and output
Realistically can the CDI model be collapsed into a COGS model
Thursday
Can we have some consistency in the UML-XMI expressed by the canonical XMI
What are the realistic possibilities of having a COGS maintenance pipeline for CDI
Prioritization of outputs from COGS
Actions:
Wendy will draft content on changes in the DDI Agency Registry for the DDI Alliance page, have Dan S. review it, post, and add information to the Scientific Board URN document
Jon will add information from today’s meeting on the Virtual Meeting page including a draft meeting schedule
ATTENDEES: Wendy, Flavio, Darren
Funding requests
Draft of the joint sponsored meeting on implementation languages sent to Arofan for CDI input
TC meeting is just an update of last year's with an adjustment of goals and outcomes
Codebook 2.6
Finalize specification (header update) and make available to Darren (for CESSDA work)
Complete draft of high level documentation by EOM
Get out for public review as soon as possible in May
Publicizing DDI Agency Registry work
Draft content for web site in easy to understand words regarding capabilities and options
TC priorities for next year:
Implementation languages - do not mean to hold up work of specific groups but want them to be aware of the work so we don't design ourselves into corners or have decisions driven by single products
2.6 completion and publication
Requirements management
Production framework
Mapping within DDI Suite - translation work
Explore options for Codebook in terms of design rules (what needs to be backward compatible)
Syntax representation issues - maybe put new ones out in a beta-like mode for testing and identification of problem points - both Lifecycle and CDI plan RDF representations in the first half of fiscal year
No meeting
ATTENDEES: Wendy, Oliver, Darren, Flavio
DDI Agency Registry publication/announcement
Sent email to Dan and Barry regarding this
Would like to complete in the next 3 weeks
Identifying technical contacts in DDI
Notes from Jared's email
Next steps:
At its next meeting, the Scientific Board will review and finalize the draft definitions of Scientific Representative and Technical Contact
note: from SB transition working group-content for Technical Contact was originally written by TC
Jared will contact the member representatives requesting names for Scientific Representative and Technical Contacts. Both are optional.
We still need to figure out the optimal way of sharing those names.
Preparing for May Virtual meeting - what we need to complete in April
SEE: TC Virtual Work Meeting - 2-6 May 2022
(linked from TC page upper left box of current activities)
Items below were noted today and are on the page
UML/XMI review of what needs to be captured
Updates to the ingest program of known easy to fix problems
Outline the discussion issues around complex choices
Identification/Reference details
DDI Alliance resolution system for CVs and RDF vocabularies
DDI Controlled Vocabulary for Aggregation Method
Darren talked to Miles and he feels it's very straightforward, beginning work on infrastructure the week of the 11th of April.
The software PUBBY they were going to use is quite old - wouldn't really need if we use the CESSDA generated HTML
https://github.com/cygri/pubby/tree/master/src/main
Do we need a user interface for this - for general publication of CVs?
This can be done with html, skos, and codeList
What would the user see if they followed a link. They would get the RDF or JSON structure. But the SKOS could contain the link to the html
HTML as a mime type could redirect to the visual page
The transforms occur in a Bitbucket pipeline using XSL transform. This would run within free-of-charge run time (say once a month)
Technical Design Specification
Added link from TC page CURRENT ACTIVITIES box and from the original Resolution for DDI-CVs and RDF vocabularies
ATTENDEES: Wendy, Jon, Dan S., Flavio, Oliver, Johan, Darren
Agenda Items:
Worked on funding requests, preparing one of them for discussion with CDI (this was recorded as changes in the draft documents found in the TC Draft shared Google folder).
Did a final review of the recommendations for technical document review. This can be sent to Ingo and Hilde for discussion in SB
Members were asked to look at DDICODE-85 regarding a new Codebook Tree document and add comments as needed.
Discussion of the Recommendation for Review of Technical Documents brought up some broader issues noted below. Some of these are broader than the purview of TC so they will be raised in SB or EB as appropriate.
Discussion of types of documentation, technical interoperability, semantic comparability, use case focus
Organizing information can be difficult to meet the needs of different people
Need to be clearer about what can be found where
Technical interoperability
Semantic interoperability
Clearer publication of what is covered and perspective - on product page
How to describe the decision making approach in using different features (CVs, Variable Cascade, etc.)
What is the line between "official" publications of a product and what are supplementary materials (like the training materials)
Guidance documents by users of DDI
ATTENDEES: Wendy, Jon, Dan S., Flavio, Oliver
Reviewed current work plan through December 2022
CV work - Oliver has been working on translation of output. Needs additional permissions at CESSDA.
Identified focus for July-Dec 2022
Identified activities requiring funding
May virtual meeting on COGS work:
Outstanding issues:
Substitution Groups - Physical
Complex choice - dealing with inheritance hierarchy, comprehensive modeling of current information
Consistent modeling across model -
What can we do pre-meeting
Listing of things that were skipped or changed
Documents recently discussed in TC draft google folder:
WorkPlan for July Dec 2022
Review Process for Official Technical Documents
Integrated production framework
Requirements and Process Management
TC Work Plan 2021-22
No meeting
ATTENDEES: Wendy, Jon, Larry, Oliver, Dan S., Darren, Johan
CV documentation in codebook
CV group has requested that there be a link from within the schema to the specific CV
This raises the question of how we handle mid-release CV updates
DECISION: There will be a URI to the latest version; that is what would go into the documentation
We can update by doing only a minor update.
If documentation for new versions of Lifecycle is being generated, this can easily be integrated into the field level documentation
Is the SKOS a near thing in terms of exporting the SKOS we need to process. Oliver needs to look at it.
DOIs for Best Practices and High Level Documentation
Would prefer DOIs to be on the HTML version of the Best Practices or High Level Documents
When citing a standard you reference the specific home page URL
Library Guides: APA 7th Referencing: Standards & Patents cite standard number and publisher
Issue is whether we want to maintain the versioning or take snapshots and deposit them in ZENODO
Either we manage the versioning locally or we have to do the snapshot and copy to ZENODO
Step this back to the question of what is the need for a separate DOI for these official publications related to a product. We need a clearer sense of the use case for this as well as the broader need. Doing DOIs requires that we manage versioning with more than an internal log of update changes within the document.
If we can't cite a document without a DOI then we need a DOI. File and management persistent identifier versioning situation
ACTION:
Raise a TC ticket and write down what we understand to be the problem
One point about persistence: if we make changes in reorganization of our website we need to recognize the problem of breaking links.
Nail down what needs to be done for the documentation of the changes to the DDI Agency Registry both on the site and on the DDI Alliance page, then the announcement.
Mixture of a user guide and best practices document. Item for the Virtual TC - spend about an hour on the agenda. Coordinate with the SB URN working group
CANCEL Meeting on 10th March
ATTENDEES: Wendy, Jon, Larry, Oliver, Dan S., Darren, Johan, Carsten (2nd half for URN discussion)
Codebook:
Reviewed changes to agent types, conceptualTextTypes and concepts - spreadsheet should be implemented as listed
Enter these as a set for review, then enter documentation changes to 68 items. Note that 5 items need examples.
Mapping:
Johan - SSHOC project (Social Sciences and Humanities Open Cloud) to create mapping to other standards and policies for conversion. There is a month and a half to the end of the project. Will present the work to TC next week. Contact Flavio, who will not be able to attend.
Recommendations part:
The first choice - is that registering every URN that is created?
Resolving: DNS service records are just for service resolution to an end-point and a port (this has been there for a long time)
HTTP service end-points (3 built in web resolution, DDI which is the individual, and set with all related URNs)
No differentiation between global and internal
Carsten:
What services should be operated to support URN resolution
What should be provided by the DDI Alliance
What should be provided by the member agencies
Diagram of how URN resolution works (diagram in the SB WG paper)
Need to clarify what is already available
Once you know a URN belongs to DDI you can query DDI Agency Registry
If there is an established resolver used by agency
A service record is only providing an end-point (service and port)
There is nothing saying what type of service is being provided
You'd have to know the common name for the web/URN/verbatim service record and could use the DNS records to look up where those resolvers are
Service record look-ups are not well supported across the board, for example by web services
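As a rough illustration of the point above that a service record supplies only an end-point (host and port) and says nothing about the type of service, the following Python sketch parses SRV records written in the RFC 2782 presentation format. The service names and hosts are made up for illustration and are not taken from the actual DDI DNS configuration; the selection rule is also simplified.

```python
from typing import NamedTuple


class SrvRecord(NamedTuple):
    name: str      # owner name, e.g. "_ddi-resolve._tcp.agency.example" (hypothetical)
    priority: int  # lower value is tried first
    weight: int    # relative weight among records of equal priority
    port: int
    target: str    # host offering the service


def parse_srv(line: str) -> SrvRecord:
    """Parse one record of the form: name TTL class SRV priority weight port target."""
    fields = line.split()
    if len(fields) != 8 or fields[3].upper() != "SRV":
        raise ValueError(f"not an SRV record: {line!r}")
    name, _ttl, _cls, _type, priority, weight, port, target = fields
    return SrvRecord(name.rstrip("."), int(priority), int(weight), int(port),
                     target.rstrip("."))


def pick_endpoint(records: list[SrvRecord]) -> tuple[str, int]:
    """Return (host, port) of the preferred record.

    Simplification: lowest priority wins, ties go to the highest weight.
    Real clients choose randomly in proportion to weight.
    """
    if not records:
        raise ValueError("no SRV records")
    best = min(records, key=lambda r: (r.priority, -r.weight))
    return best.target, best.port


# Hypothetical records for illustration only.
SAMPLE = [
    "_ddi-resolve._tcp.agency.example. 3600 IN SRV 10 5 443 resolver1.agency.example.",
    "_ddi-resolve._tcp.agency.example. 3600 IN SRV 20 0 8080 backup.agency.example.",
]
records = [parse_srv(line) for line in SAMPLE]
print(pick_endpoint(records))
```

The result is only a host and a port; the caller still has to know by convention what kind of service answers there, which is the gap noted in the discussion.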
How to come from DDI to the actual object
"N to T" (N2T, Name-to-Thing): ARK IDs, handles, etc. at the California Digital Library
This can do "N to T"
Have to know what the
No common resolution but there is a way of figuring out who owns the URN space and linking to the Home Page - DDI Agency Registry
There is a common system https://registry.ddialliance.org
You need to know the syntax of the DDI URN to proceed to go from ARPA to DDI (the registration in ARPA does that)
Don't want implementers to use that API to resolve everything, but to use that as a means of structuring local tool so that
A load limitation precludes having this available for handling all queries
So what's missing?
Identifier resolution and identifier end point. Individual identifiers can be resolved in terms of a consistent pattern within an agency or sub-agency. Individual variations from the pattern are not supported.
Published content owned by the DDI Alliance (primarily CV's and RDF vocabularies) - Darren and Oliver are completing that
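The pattern-based resolution noted above (individual identifiers resolved through a consistent pattern within an agency or sub-agency) could be sketched in Python roughly as follows. This is only an illustration: it assumes the `urn:ddi:agency:id:version` form of a DDI URN and a dot-separated sub-agency naming scheme, and the agency name, URL template, and placeholder names are hypothetical rather than anything returned by the DDI Agency Registry.

```python
import re
from urllib.parse import quote

# Assumed shape of a DDI URN: urn:ddi:<agency>:<id>:<version>
URN_PATTERN = re.compile(
    r"^urn:ddi:(?P<agency>[A-Za-z0-9.\-]+):(?P<id>[^:]+):(?P<version>[0-9.]+)$",
    re.IGNORECASE,
)

# Hypothetical agency-level templates; a real agency would define its own.
TEMPLATES = {
    "int.example": "https://resolver.example.org/item/{agency}/{id}/{version}",
}


def parse_ddi_urn(urn: str) -> dict[str, str]:
    """Split a URN into agency, id and version, or raise ValueError."""
    match = URN_PATTERN.match(urn)
    if match is None:
        raise ValueError(f"not a DDI URN: {urn!r}")
    return match.groupdict()


def resolve(urn: str) -> str:
    """Apply the template of the agency, falling back to its parent agency."""
    parts = parse_ddi_urn(urn)
    labels = parts["agency"].lower().split(".")
    while labels:
        # Try "int.example.sub" first, then "int.example", and so on.
        template = TEMPLATES.get(".".join(labels))
        if template is not None:
            return template.format(
                agency=quote(parts["agency"], safe=""),
                id=quote(parts["id"], safe=""),
                version=quote(parts["version"], safe=""),
            )
        labels.pop()
    raise LookupError(f"no template registered for agency {parts['agency']!r}")


print(resolve("urn:ddi:int.example:Var_1:2.0.1"))
```

The sketch returns a URL for any well-formed URN under a known agency; whether the identifier actually exists is left to the agency's own service.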
General Discussion
Access to the objects is in the court of the Agency
In the recommendation (1) suggesting that the Alliance would register individual URNs (long paper)?
If a DDI URN is found in the wild the DDI Alliance can direct you to where it is found
That you can get a formal description of service
Individual URNs were not considered an option
There is a template approach (end-point at an agency level) they can provide additional breakdown at the initial endpoint (for example based on knowledge of internal structures)
What the current service does not provide is validation of agency assigned IDs. It is their responsibility to state it is not a valid URN. Even with handle service you have to keep up-to-date with directions
Use case not covered - is the Alliance taking on what is actually the agency's responsibility? The Home Page - DDI Agency Registry is simply a passthrough.
Is it clear to the agency that this is what they'd have to do?
What about a scenario where someone registers an agency and then deposits it in an archive?
They'd keep their same agency ID and can define their own resolution. You could put the archive as the service for their agency. The resolution service would have to direct it to different agencies.
Best Practice around this? If it is a distributed resolution environment and an agency has deposited in several different archives, they need a resolution system to address that.
TC is open to further discussions with Carsten or with the URN Working Group as needed.
Background Information:
Carsten's email:
My understanding when taking on the lead of the URN TWG of the SB was that only the first step, Consumer<->DNS was working.
I understood our mission to clarify which of the other exchanges in the diagram needed to be implemented by either the Alliance or an Agency.
This is what our discussions on centralised vs decentralised solutions were mostly about.
From Dan's slides from EDDI (https://doi.org/10.5281/zenodo.5747652) I understood as follows:
Slide 8 says that the second step consumer<->registry for SRV records has been available for some time, but I don't know how it works and can't seem to find the documentation.
Slide 11-1&2 allowed me to do the third step consumer<->registry/Agency for the URL of a concrete URN, but Slide 11-3 did only allow me to retrieve the pattern via API, not the specific answer for a given URN.
The final step consumer<->repository is beyond the scope of the URN service.
Did I understand this correctly and where can I find a full documentation of steps two and three to follow end to end?
If I understand correctly, there is no mechanism foreseen to decide whether any URN is valid by the registry, but it would always return a URL that the repository then can't answer. (see screenshot)
I am happy to discuss this also tomorrow afternoon and please do forward this mail to those whom it concerns.
ATTENDEES: Wendy, Larry, Dan S., Oliver, Flavio, Darren
XKOS best practices:
XKOS will have an invited technical review starting Feb 25, followed by a public review for comprehension lasting 6-8 weeks
Publication will have a DOI (working with Jared on process)
Wendy will draft Guidelines for officially published documents (Best Practices, high level documentation) for submission to the Scientific Board for comment. This will be based on the XKOS process as the first document published under this process.
Codebook:
DDICODE-83 and general policy for use of conceptualTextType
Result of this is that ProcStat will not change as the only real use is internal processing information and those with processing systems have not requested this. There are no broader standards.
Other items in document were approved.
The cutoff for deciding if an element should change from a simpleTextType to a conceptualTextType:
If there is a standardized list it should be conceptualTextType
If it is ONLY for processing (internal information) it is probably overload
DDI Agency Registry:
Link to presentation now available. Contact Barry about marketing announcement. Dan S. will work with Barry on this.
Wendy will get these new capabilities into the URN Working Group paper to make sure they are complete
Wendy will review directions in Administration section of DDI Agency Registry and write up an information page for the Agency Registry page on the DDI site. Content will be sent to Dan for review when ready.
ATTENDEES: Wendy, Jon, Oliver, Flavio, Larry, Darren, Dan S.
Update on Codebook activities:
Review of field level documentation has identified one or two additional needed changes (for consistency). These will be grouped and entered for review as a set.
Updated work on high level documentation - addition of a "tree" representation of Codebook
Follow-up on TC review of URN Working Group paper:
Timing of review by TC will be discussed in next URN WG meeting
Wendy will update the URN paper with examples from Colectica Presentation
DDI Registry presentation by Dan 10.5281/zenodo.5747653
DDI Agency Registry Upgrades
CV LOD system
SKOS recommendation
There are also documents on the Atlassian site. These will be merged and TC members informed that it is ready for review and comments
Versioning approach is being worked out and entered
Transform of existing SKOS into a cleaner SKOS (Oliver)
Publish transforms into bitbucket and then those will be grabbed and posted to the cloud
2nd week in march is slated for setup of the cloud platform
New CV tools also have been proposed to be deployed in March
Oliver has been in contact with CESSDA so that he is aware of where to do tweaks in URIs to deal with identifiers for codes
COGS
After the virtual meeting regarding COGS input we will be focusing on COGS output with a goal of identifying where similarity in formats for XML, RDF, JSON-LD, other formats is useful between products and where differences are needed to support specific applications. Planning needs to start for funding and scientific plan in conjunction with the groups developing products. For example, Pierre Antoin has a relationship with Scientific Board in terms of development of RDF JSON-LD for DDI-CDI and RDF in general.
This discussion is broader than TC and work has been done in each of the products with one or more formats that needs to be considered:
DDI-L - XML (moving to full serialization), plans for RDF, JSON-LD, UML, etc.
DDI-C - XML (hierarchical structure)
XKOS - RDF (SKOS extension)
SDTL - JSON
Controlled Vocabularies - RDF (SKOS), XML (Codelist)
DDI-CDI - XML (serialized), plans for RDF
Google groups and role of SRG list
Update on Google Groups and continuation of the role of the SRG list to serve as a means of keeping interested people in touch with development discussions.
Dan had question on specific feature of set up regarding prefixes. He will contact Jared directly
ATTENDEES: Wendy, Oliver, Larry, Flavio, Dan S., Barry
Codebook -
Language by element repetition
pull-requests merged
Content and review of high level documentation - use review to edit draft
LOD Infrastructure
Met with staff about what to do with the URIs released by the CESSDA service
CESSDA is doing a relaunch in March with minor changes which would cover all the system changes we needed for URI scheme
Bitbucket repository will be set up soon so Oliver can pass over the scripts for production
We may not need an extra transformation for target SKOS
2 major fixes are URIs to be really resolvable and not deliver empty descriptions for categories if they don't occur in different languages
URIs are generated with CESSDA output and we would be able to define the URIs produced
Categories would have a consistent technical identifier rather than a clear text identifier
SKOS to Codelist - Oliver will see if Darren has done any work on this, if not he will do this within the next week or so
Virtual Meeting
April meeting - Dan and Jeremy busy the week of the 4th (6-11 unavailable)
XMI/CDI:
Cardinality of both source and target
Relationship types - aggregations, compositions may not be relevant
Uses and dependencies (patterns)
Types are in the names now
Achim has done drafts of description of UML that can be used for this style of modeling
We would want options for getting this into COGS
Review of CDI XMI and of Achim's draft - Flavio and Larry
Nail down dates as soon as possible so we can get some time frames for pre-meeting work
The mix of data and metadata is important - JSON is a good vehicle
XKOS Best Practice Review
Process (review of process itself)
DOI application - multiple DOIs? ex. INSEE
Scientific Board WG on URN -
Beneficial to make sure it addresses what is currently available and what is still needed
It can resolve you to an individual URN - they can define a range of services
Content hosting is not supported - URN resolution is solved
TC should provide feedback on the draft of the document as this is the group that has been working on it for the past decade
There are a lot of resolution options already available and we'd like to make sure they are focusing on new
ATTENDEES: Wendy, Dan S., Larry, Johan, Darren, Flavio
CV work
Discussion of low level specifics took place this week. Oliver expects transformations complete in a few weeks and Darren should complete some of his work. Something testable by end of February. CESSDA SKOS compliance probably not until end of year. We are not dependent upon CESSDA compliance to complete our work.
CODEBOOK
Get a 2.6 out for review
Work on weighting is not ready yet and should not hold up 2.6 given the DataVerse-related need. Better to get it correct than to put it in not quite finished. Backward compatibility rule. Don't hold but encourage movement on the sections that have been dropped (SDI and NCube).
DDICODE-39 is another one to hold for further work. Note these issues in the release for review information for 2.6 and add a label post2.6 to these issues.
Larry is looking into DDICODE-39 to see if there is a clear and unambiguous solution that could go into 2.6. If not it will be worked on for 2.7
Set up high level document site for Codebook with Jon.
CDI is still finalizing documentation and will get to us dependent upon people's work availability.
ADDITIONAL ISSUES we need to organize for discussion:
XML is currently expressed in different styles between Codebook, Lifecycle, and CDI
what are the implications?
Are there overall rules that can be carried across the broader set of products and not just designed along a single one
RDF usage is currently in SDTL and XKOS
These have their own RDF schemas and SDTL is integrated in PROV-1 and XKOS is integrated with SKOS as it is an extension
Implementation languages for representations
Discuss when work plans need to be in for SB and how this coordinates with the financial requests.
NEXT WEEK NO TC due to Scientific Community Meeting
ATTENDEES: Wendy, Larry, Oliver, Dan S., Darren, Flavio, Johan
Requirements and Process Management - Integrated production framework and platform
Requirements and Process Management
Good idea to have a sense of an integrated platform
Infrastructure documentation - designing a standard SDLT
We need to have a person who will oversee the design and implementation of the infrastructure
Inventory of the existing buckets
There shouldn't be any piece of production process or work we do that isn't in a public space
Shouldn't rely on code or procedures that are in someone's head rather than in some published location
We don't want systems reliant on an individual
Availability of platform, both software and hardware, (needing to run on individual computers and requiring set up - example documentation set up for field level documentation)
Initial discovery exercise - spreadsheet Wendy set up
Look additionally at options
If we need additional cloud space we will need to request beginning in the new fiscal year
FFair(?) is set up to rebuild documentation each time a change is made
Identify what is used by each product
When you use free tools is everything exportable? Yes, it just runs a rebuild script in a conformable environment with multiple platforms
https://ci.appveyor.com/project/ddibot/ddimodel
Darren will take a first stab
CDI
Jan 19 would be earliest but they are still working on the documentation so will probably be a bit later
Codebook
Will be scheduled for next week to see finalization for review
CV work
Progress has been made and Oliver and Darren will meet next week
Versioning at CESSDA is getting clearer (meeting on Monday)
Profile of SKOS structure
Technical Design Specification | Metadata Profile (CVS<>RDF<>DDI L)
Requirements and Implementation Management
Needs additional work - need to determine how to go about this
DDI Agency Registry update work
Acknowledged that Colectica has completed the work on the DDI Agency Registry update and presented on the work at EDDI. Announcement will be made when video of the presentation is published so that it can be referenced in the announcement.
ATTENDEES: Wendy, Jon, Dan S., Oliver, Larry, Barry
Review of Work Plan:
XKOS:
Talk to Franck and settle on a process
Light way would be to do it to the list - set up a comments process
Comments period for Best Practices
Meet XKOS needs and outline a basic approach to formal Best Practice papers
COGS Virtual meeting
Jon will drive meeting
Correct the input issues we know about
Focus on input accuracy, support for output needs in XMI, input of XMI as well as current XML content, sorting through entry issues and complex nesting
Easter is a problem week before or after 17th April (Oliver is available)
Oliver may have some problems with the week of April 25
Google docs for document working
BitBucket features - due dates comments
Slack possible but may not work for all
Task and process assignments
Who we want and what we want to get done, set up agenda early so we can nail it down early
Mapping Work:
Mapping and comparison is progressing and is also being discussed in the Scientific Board - we need to coordinate
Caution: we need to be clear on what the mapping is for
Conceptual down to transformation
Managing ownership
Additional Notes:
EDDI videos will be coming out in the next few weeks as they come in - Jon will talk to Jared and probably post by track
Follow up with Jared and Exec on the period of the fiscal year given shift in work plan year to January/December (this year’s work plan goes through December 2022)