Modelling Team Meeting Minutes


Materials from External Review work at Dagstuhl Sprint

 Virtual MRT meeting 2019-01-30

Agenda (as in invite of 2019-01-28)

Goal: To have a draft program of work for NADDI Sprint (to submit to EB), and an initial list of proposed tasks.

  1. Description of what is needed for organizing NADDI Sprint (from Achim, with possible draft planning document?)
  2. Discussion of work topics/tasks sufficient for initial organization/NADDI Sprint

Suggestions:

    - Documentation of datum-based model application to examples of event data, aggregates, etc.

  • Agreed list of data types to be worked on

  - Existing modeling technical requirements/issues

  • Simplification of the model (i.e. less inheritance and less specialized classes)
  • Review of collections (use of appropriate UML properties, use of collections throughout the model)
  • Review of design patterns (relationship to acknowledged software design patterns, relevance of design patterns for users of the model and of the representations)
  • Review of views (definition and effective use of subsets of the model)

  - Others

  3. Plan for addressing infrastructure tasks (modeling tools, production framework & process, testing groups/liaison, etc.) to support immediate and longer-term tasks

    - Are there ideas/candidate tools which need to be written up/further explored?

  • Current status of production platform post-Berlin

    - Identify testers and potential testers

(We may not get this far, but if we have time it would be good)

Minutes DDI4 MRT Virtual meeting 2019-01-30

Attendees:  Achim, Arofan, Dan G., Flavio, Hilde, Jay, Larry, Oliver, Wendy

Apologies: Jon

1.      Description of what is needed for organizing NADDI Sprint (from Achim):

A goal for the meeting is to have an agreed document regarding the NADDI Sprint planning ready to send to the AG to inform their discussions at their next meeting, and further to apply for funding of the possible sprint.

Achim prepared and sent out the document ‘NADDISprintPlanning.docx’ to the srg list in advance of the meeting. This is a shell document in which some content needed to be filled in, or reviewed and agreed, during the meeting.

The meeting was structured in three parts: 1) Topics for the possible NADDI Sprint; 2) Review of possible participants and funding; 3) Other organizational issues regarding the possible NADDI Sprint.

1) Topics for the possible NADDI Sprint (see point 2 in the agenda)

a) Documentation of datum-based model application to examples of data structures (which data structures to focus on at the Sprint was to be discussed and agreed at the meeting).

b) Discussion and possible resolution of structural model issues:

  • Simplification of the model (i.e. less inheritance and less specialized classes)
  • Review of collections (use of appropriate UML properties, use of collections throughout the model)
  • Review of design patterns (relationship to acknowledged software design patterns, relevance of design patterns for users of the model and of the representations)
  • Review of views (definition and effective use of subsets of the model)

Status: Points a) and b) agreed as topics for the possible NADDI Sprint.

Discussion and agreements regarding example data structures a)

Example data structures (point a) were discussed after the structural model issues (point b).

The discussion regarding which data structures to focus on as examples at the possible NADDI Sprint was centered on whether to focus on common vs. complex cases and corner cases.

Dan G. pointed out the importance of modelling complex cases, as more common or simple cases would then be solved at the same time. Others pointed out that issues could occur even if a similar approach is used. Agreement was reached to focus on the common cases as a preparation for the possible NADDI Sprint.

Status:  Agreement was reached to focus on the following common data structures for the possible NADDI Sprint:

  • Rectangular data
  • Event data (wide and narrow data)
  • Single datum points
  • Multidimensional data like data cubes and aggregate data

Arofan and Wendy pointed out that the Variable Cascade documentation (provided, for example, in the Variable Cascade presentation from the Dagstuhl workshop DDI Train-the-Trainers 2018) indicates the style and level of information needed for documentation.

        Status: Wendy will add this as a prototype review comment.

After NADDI, further data structures may be explored, for example NoSQL (non-SQL) data such as Hadoop data, graphs, etc.

         Status: Agreed

An example from the discussion, provided by Larry, is given in the Appendix.

Discussion and agreements regarding structural model issues b)

Structural model issues (point b) were discussed before the example data structures (point a).

Conceptual resolution/MRT: Jay raised the question of whether structural issues could be resolved conceptually or by using the MRT approach. Flavio pointed out the need to look at many different examples to check structural modelling issues. Achim indicated this could be a topic for the possible face-to-face meeting and something for a work group to focus on in advance.

Complexity of the model: Flavio commented that the model is complex because it is made complex. It has multiple levels and covers both common and domain specific needs. To simplify the understanding, some of the content could for example be hidden for specific user groups.

Achim asks if the model can be improved by focusing on questions like:

  • What is really the core?
  • What are the fine-grained details?
  • What are domain specific things?

Work regarding the complexity of the model could be done in advance and brought to the sprint.

Review of views: The revision of Views is important. Achim points out that even a simple view like the Agency view pulls in a lot of classes.

Flavio points out that Views are complex because they currently are designed to cover multiple dimensions. The Classification View is for example meant to cover reuse, classification management and publishing. This and other views would need separation into smaller sets to be easier to understand.

Larry expresses that the model currently is highly connected but that good documentation can help the understanding.

Status: Agreement to focus on the four bullet points under b) above for the planned Sprint. Tasks should be broken up as much as possible. Smaller groups could work on each of those and get back with a proposal for the full group after a week or two. A specific person should be responsible to follow up on the work on each task.

 2) Review of list of possible participants and funding

The following agreement was made:

 The following people would be available in person for this meeting (their need for funding in parentheses):

  • Achim Wackerow (travel, accommodation, food)
  • Arofan Gregory (travel, accommodation, food)
  • Dan Gillman (accommodation, food)
  • Flavio Rizzolo (lives in Ottawa)
  • Hilde Orten (to be clarified)
  • Jay Greenfield (accommodation, food)
  • Jon Johnson (accommodation, food)
  • Larry Hoyle (accommodation, food)
  • Wendy Thomas (accommodation, food)

Most of the people would need funding from the DDI Alliance as specified in the NADDISprintPlanning_1_0.docx document.

Oliver Hopt would be available by phone.

3) Other organizational issues regarding the possible NADDI Sprint

Possibilities for meeting location and lodging have been checked out and booked by Flavio and Achim as follows:

  • Two meeting rooms at StatCan for Tuesday and Wednesday
  • StatCan is closed on the Monday due to Easter. A hotel can be used for the Monday meeting at additional cost, and a room has been booked.
  • 12 rooms are reserved at the hotel. The price is a bit higher on Sunday and Monday than on Tuesday and Wednesday, due to Easter.

Two documents are sent to the AG for their feedback prior to their next meeting (also sent to the srg list):

  • The MRT DDI4 Core proposal document (MRT_DDI4Core_1_0.docx) - sent by Achim on Monday 28th.
  • An agreed, updated version of the NADDI Sprint Planning document (NADDISprintPlanning_1_0.docx) – sent by Achim after the meeting on Wednesday 30th.

Further follow-up is required regarding organizing the start-up of the work, and making plans for what needs to be prepared in advance of the possible NADDI Sprint.

Appendix

Example from Larry related to discussions of point a):

With the ability to describe data at the datum level DDI should be able to describe data like that in the following example through transformations from traditional rectangular (wide) layouts into key-value (tall) representations.

 DDI4 can currently describe the data in the wide layout, but, though we have discussed how to do the tall representation, that work has not been completed in the model.

 Wide data table:


Corresponding tall representation:

Transformations between these layouts are common in data software packages. The SAS code below shows the transformation from the wide to the tall.

Note that in the tall representation the column Source is a pointer to a variable in the wide layout. The column Value1 is not a traditional variable, in that there is no single value domain or concept associated with the whole column; instead, the value domain and concept of each value depend on the variable that Source points to.

If we can properly describe datum level metadata we should be able to describe the value domain and concept associated with the “yes” category label (which is actually a code of 1 in the SAS dataset) in the Value1 column. We should also be able to describe the meaning and units of measurement of the value 185 in the same column.

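/* value labels for the yes/no codes */
proc format;
  value yn
    1="yes"
    2="no"
  ;
run;

/* example rectangular (wide) file */
data fooWide;
  input Name $ Height Answer;
  label Name="Person name"
        Height="height in cm"
        Answer="Answer to 'Are you happy?'";
  format Answer yn.;
  datalines;
Joe 185 1
Mary 160 2
;
run;

/* sort by the BY variable used in the transpose */
proc sort data=work.fooWide;
  by Name;
run;

/* wide to tall: one row per Name and source variable */
PROC TRANSPOSE DATA=work.fooWide
     OUT=WORK.fooTall(LABEL="Transposed WORK.FOOWIDE")
     PREFIX=Value
     NAME=Source
     LABEL=Label
  ;
  BY Name;
  VAR Height Answer;
  /* intended to display Value1 with the yn format */
  format Value1 yn.;
run;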
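The datum-level idea described above can also be sketched outside SAS. The following Python fragment is only an illustration under stated assumptions: the names WIDE_METADATA, to_tall and describe are hypothetical and not part of DDI4 or of any library, and the metadata dictionary is a stand-in for whatever the model would eventually hold. It reshapes the same two records into the tall layout and then resolves the meaning of each value through its Source pointer, which is the step the tall representation needs the model to support.

```python
# Hypothetical sketch: none of these names come from DDI4 or from a library.

# Variable-level metadata of the wide layout (labels and codes adapted from
# the SAS example; the "unit" entry is an assumption for illustration).
WIDE_METADATA = {
    "Height": {"label": "height in cm", "unit": "cm", "codes": None},
    "Answer": {
        "label": "Answer to 'Are you happy?'",
        "unit": None,
        "codes": {1: "yes", 2: "no"},
    },
}

# The two records of the wide (rectangular) example file.
WIDE_ROWS = [
    {"Name": "Joe", "Height": 185, "Answer": 1},
    {"Name": "Mary", "Height": 160, "Answer": 2},
]


def to_tall(rows, key, variables):
    """Reshape wide rows into one (key, Source, Value1) row per datum."""
    tall = []
    for row in sorted(rows, key=lambda r: r[key]):
        for var in variables:
            tall.append({key: row[key], "Source": var, "Value1": row[var]})
    return tall


def describe(datum, metadata):
    """Resolve the meaning of a single datum through its Source pointer."""
    meta = metadata[datum["Source"]]
    value = datum["Value1"]
    if meta["codes"] is not None:
        # Coded value: look up the category label in the source variable's codes.
        return f"{meta['label']}: {meta['codes'][value]} (code {value})"
    # Measured value: attach the unit of the source variable.
    return f"{meta['label']}: {value} {meta['unit']}"


TALL_ROWS = to_tall(WIDE_ROWS, "Name", ["Height", "Answer"])

if __name__ == "__main__":
    for datum in TALL_ROWS:
        print(datum["Name"], "|", datum["Source"], "|", describe(datum, WIDE_METADATA))
```

The point of the sketch is that Value1 has no value domain of its own: the same column holds a measurement in centimetres and a yes/no code, and each one can only be interpreted by following Source back to the wide-layout variable.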






 Virtual MRT meeting 2019-01-23

DDI4 MRT Virtual meeting 2019-01-23

Agenda (as in invite of 2019-01-23)

Goal: This meeting should get us to the point where we are ready to propose a formalization of this effort to the DDI Executive (or take other steps necessary for approval). To that end, the following agenda is proposed: 

Agreeing the document as regards:

- Organization

- Scope

- Timeline

Details on organization and scope:

- MRT Lifecycle feedback loop

- Status of the sub-groups (see new version of document, section on organisation and structure, as well as section 4 in the minutes of last meeting).

- Alignment of other standards, provenance (see new version of document, section on alignment with metadata structures in DDI4 Core, and discussions in the Appendix of the minutes of last meeting)

- Finalizing the document and process for approval: Things to be added, changed or removed – or approve new version as is?

Minutes DDI4 MRT Virtual meeting 2019-01-23

Attendees:  Achim, Arofan, Hilde, Jay, Jon, Larry, Oliver, Wendy

Organization and scope:

Basis for the discussion:  Document updated by Arofan with input from Achim, ‘MRT_DDI4Core_Diff_0_2_and_0_3_JW’, attachment to email from Achim to srg list 2019-01-23.

The goal of the meeting was to finalize the MRT-DDI4 Core document to be sent to the Advisory Group for their comments.

         Jay suggested also starting to plan work tasks at this meeting.

                Status: Agreement to take on tasks later on, and to prioritize the finalization of the document at this meeting.

Discussions regarding the document:

-Organisation and Structure: Achim points out that the Core group guides the whole effort, defines sub-tasks and assigns a responsible person for each task, who reports back to the full group. The feedback loops should be done in short, iterative cycles. Tasks are not long term, and should be discrete and well-defined. It is important that the document reflects this.

        Status: Agreed

-Role of the MRT in the organization: Wendy asks how the MRT relates to other DDI groups. Larry points out that this is a new way of organizing the work of the Modelling Team.

        Status: Agreement that MRT replaces the Modelling Team. The work of the group should be well aligned with the Advisory Group.

-MRT feedback loop: The requirements for the feedback loop were discussed. Achim points out that the requirements are just a repetition of earlier goals of the Moving Forward project.

Proposals for amendments to bullet points (for clarification purposes):

-Remove ‘if required’ from the ‘lossless roundtrip’ bullet point.

-Add ‘Stability’ to the ‘Consistency’ bullet point.

-Change ‘Persistence of the model’ to ‘Persistent expression of the model in canonical form’ (not to be confused with canonical XMI).

       Status: Agreement to update the bullet points accordingly.

-Mapping of DDI4 to earlier versions of DDI: This was discussed at our last meeting (January 16th). Larry points out that conformance and divergence with previous versions of DDI should be clearly defined.

Status: Agreed to include a section on this in the document.

-Alignment with other metadata standards: Jay asks if SDMX should be mentioned in the list of standards included in the document. Arofan points out that the document indicates ‘at least’ which standards the DDI4 Core should be interoperable with.

        Status: Agreed to highlight ‘at least’ in the document.

-Production process: Jon asks if the production process should be mentioned in the document. Arofan points out that this is a big and important topic that needs to be addressed. We will need to come back to what the options are.

Status: Agreement that the production process should be identified in the document as something that would need to be addressed.

Timeline:

-Timing of the DDI4 Core work.

Status: Agreement for the DDI Core work to take place in rapid cycles of weeks, not months. A calendar year is the anticipated goal. Leave wording in the document as stands.

-Timing of the finalization of the document: Wendy, Achim and Arofan point out the importance of finalizing the document before the next meeting of the Advisory Group (scheduled for next Wednesday).

        Status: Agreed to finalise the document by Monday and send it to the AG for their comments. Arofan will send it to the MRT group today (the 23rd) for comments.

Other:

NADDI Sprint: Achim proposes a face-to-face meeting of the group after NADDI, of possibly three days, and asks whether people in the group would be interested in this. All participants on the call indicated that they would be interested. Their possibilities for attendance and dependencies are specified below:

  • Achim (needs funding from the DDI Alliance)
  • Arofan (needs funding from the DDI Alliance)
  • Dan G.
  • Flavio (will hopefully be able to attend – he lives there)
  • Hilde (needs funding from NSD)
  • Jay (needs room support from the DDI Alliance – will cover his own travel)
  • Jon (depending on acceptance of abstract for the NADDI conference).
  • Larry
  • Oliver (will attend virtually)
  • Wendy (needs funding from the DDI Alliance)

Possible topic for the agenda of the NADDI Sprint: Flavio proposes to focus on the modelling requirements.

                Status: Agreement that Achim will follow up with the AG regarding the possible NADDI Sprint, and contact the local organisers regarding possible venues, etc.

 Virtual MRT meeting 2019-01-16

Agenda (as in meeting notes from 2019-01-09)

Organizational approach of where we are going and how we organize the approach
Scope, focus, groups, approach proposal
Who else needs to be recruited to make this functional
What is the approval process
OUTCOME: draft for approval

Modeling technical requirements - need to provide a summary for comprehension
Platform questions - approach to addressing this

Sparx cloud modeling approach for UML modeling https://www.sparxsystems.com.au/enterprise-architect/cloud-services/cloud-services.html
Question of production process - where this fits
XMI output - canonical approach

Minutes DDI4 MRT Virtual meeting 2019-01-16

Attendees: 

Achim, Arofan, Dan G., Flavio, Hilde, Jay, Larry, Oliver, Wendy

Topics:

At the meeting the organizational approach was discussed as described below.

UML modelling tools were discussed by email between the last meeting and this meeting by Flavio and Achim (see the information under 8) below, as well as the full correspondence in the appendix).

Provenance issues and relationship with other models were also discussed between the meetings by Flavio and Jay (also under 8) below and included in the appendix).

Organizational approach:

Basis for the discussion:  

A) Document ‘MRT_DDI4Core_0_2’, attachment in email from Arofan to srg list 2019-01-09.

B) Achim's questions to be clarified in order to build a good basis for the work next year, in email from Achim to srg list 2018-12-12. Achim's questions (1 – 8) and their workflow status are specified below:

  1. Is there agreement that Modeling, Representation, and Testing (MRT) replaces the existing Modeling Group?

Status: Agreed

2. Focus on DDI 4 Core, like Conceptual, Data Description, and Process. These areas are important for any use case perspective. Additional areas can be identified according to business requirements. But the focus on a core increases the chances of having a robust and mature deliverable.

Status: Agreed to focus on the core

3. Description of major tasks regarding major modeling issues

               -Provenance was brought in as a new topic in a discussion between last week’s meeting and this meeting, see point 8) and the appendix.

Status: Needs follow-up discussions

4. Participants and their roles/perspectives

-Proposal in the ‘MRT_DDI4Core_0_2’ document to have an administration group (coordination team) with sub-groups.

Achim suggests that smaller groups can work independently on different things and get back to bigger groups with recommendations.

        Status: Agreed

-Participants of the MRT coordination team:

Arofan suggests that this group is the MRT group coordinating team, together with Jon who has also expressed interest in this.

Status: Agreed

-Sub-teams: In the ‘MRT_DDI4Core_0_2’ document, three possible sub-teams are proposed, all with an identified lead: a modelling sub-team, a representation sub-team (with sub-teams for each representation, XML, RDF, Python etc.), a documentation sub-team and a testing and representation sub-team.

Larry expresses a concern about the idea of fixed sub-groups, as there may not be enough people for this.

Achim proposes thinking in terms of more ad hoc, task-oriented sub-groups, and inviting external experts when needed.

         Status: Needs follow-up work

-Perspectives:

A task proposed by Larry that also regards modeling (question 3 above) is to have DDI2 mapped into the DDI4 Core by the end of the year.

Comment from Achim: This work can identify missing pieces and flaws in the modelling. Not sure if all can be resolved within the year, but it should be possible to identify them. Do mapping first and then modelling.

Comment from Arofan: Transformations should be developed.

Flavio: A modelling tool should be used rather than Excel for the mapping.

Status: Agreement on the requirement to map DDI2 to DDI4. Needs follow-up work.


Task proposed by Achim:  Work on data description usage for different representations and data forms, for example unit record data, event long/short, aggregate/cube, single datum in a lake. Detailed tasks should be developed.

Example data that can be used for this purpose are data from the Australian Election Study (Larry) and the ESS (Achim), the Alpha Network (Jay), possibly others. The Alpha Network has lots of different data types. Several of the relevant data sets are structured in DDI2. Where real data are difficult to provide or new, like a datum in a lake, made up examples should be provided.

Status: Agreement on the task. Needs detailing of sub-tasks, follow-up work, who’s involved, etc.


Tools:  Flavio: Need to decide which tools to use in our work (see also discussions under 8) and discussion emails in the appendix).

Status: Needs follow-up

-Administrative work:

Hilde does the meeting minutes

Status: Agreed

       Further administrative work, chairing etc.

                Status: Needs follow up.

5. Is the proposed timeline for a DDI 4 Core at end of 2019 reasonable?

Status: Agreed working goal.

6. The development of the business requirements document can be worked on in parallel but is not task of this group.

Comment from Arofan:  Some requirements identified by this work that affect the core areas could be interpreted as technical requirements and fed back to the MRT group and be a task for the modelers.

Comment from Achim: The focus of the MRT approach is the goal of a stable DDI4 core in one year. The work on business requirements should not be tied too closely to this task, in order not to delay the process. It would be important to distinguish between what we need to include in order to have a functional DDI4 core, and what can be added later.

Comment from Flavio: Agrees with Achim. For each step in the MRT cycle it should be decided what it would make sense to include. The MRT Coordination group should decide on this.

Status: Achim's proposal agreed.

7. Information on these agreements to other groups and DDI Alliance committees

A goal is to finalize a document that takes into account the agreements made regarding the MRT approach. This document, which will represent our proposal for the DDI4 Core, will be developed and sent to the SB and EB for their approval, as well as to other groups.

Achim and Wendy: The timeliness of this document should be decided on the basis of decisions regarding the work of the group.

               Status: Agreed

8. Identification of issues which can be worked on in the next couple of weeks independently of group meetings.

                - To do for the next meeting (2019-01-23)

Hilde: Will post the minutes

Arofan:  Will prepare agenda for next week with input from others and send out invitation to the meeting to the srg list with agenda and meeting link prior to the meeting.

                 All: Think about the open issues from this meeting.

- Contributions since last week’s meeting (2019-01-09)

                UML tools discussions between Flavio and Achim:

In an email to the srg list of 2019-01-10, Flavio points out that we need to decide on a platform for developing, managing and sharing UML models. He proposes using EA Sparx; its Canonical XMI support needs to be checked. In reply, in an email to the srg list of 2019-01-15, Achim states that this topic needs to be discussed in the MRT and TC groups. Achim suggests using an open tool solution: DDI is a standard, so we cannot risk the DDI model being usable in only one tool, and cost can be an issue with commercial tools. The Canonical DDI4 XMI has proved importable by many different UML tools. The problem is that most UML tools export a custom XMI flavor rather than Canonical XMI. Achim recommends looking into the Eclipse UML tools, whose XMI flavor can be bridged to Canonical XMI more easily than that of many other tools. Bridging might be supported by the Eclipse community. The Eclipse tools can also do many other things, for example enable transformations from PIM to PSM.

                The slide below, provided by Achim, describes possible uses of Eclipse tools for MRT purposes. See the email conversation in the appendix below for the full argumentation.

                Status: Should be looked further into.


Provenance/lineage and relationship with other standards discussions between Flavio and Jay:

In an email to the srg list of 2019-01-11 Flavio brings up the question of supporting different types of lineage/provenance, and asks if everything we need to capture can be captured by Prov-O or if different standards are needed.     

Jay replies to this by providing references to a review of several provenance models, and some articles. Jay further proposes forming a provenance sub-group to look into this.

He also raises the question of whether DDI should copy or plug-and-play with other models, for example SDMX. In two emails to the srg list of 2019-01-16, Flavio replies that he leans towards plugging in other models, as a small team would not have the resources to replicate them in DDI, and that he believes DDI should specialize in a small, well-designed and well-integrated set of classes to cover the aspects of the data (and metadata) lifecycle that other vocabularies either don't cover or cover poorly. See the full discussion in the emails of the appendix.

Status: To be discussed

Appendix

Email correspondence between meetings: Flavio and Achim on UML tools, and Flavio and Jay on provenance and the relationship with other standards:

UML Tools:

Email from Flavio to srg list of 2019-01-10

Hi Achim,

We need to make a decision on a platform for developing, managing and sharing UML models. A well-known tool for that purpose is EA Sparx -- We use it at StatCan and in some UNECE HLG projects, e.g. GSIM, CSPA.

Sparx allows users to create arbitrary views on-the-fly by dragging and dropping objects from the underlying model. This way it is possible to deal with small subsets of the model at a time during development, and also to target different audiences for communication purposes. It also supports BPMN. The model can be exported to multiple programming languages.

Here you will find some pricing information for the cloud version:

https://sparxsystems.com/products/procloudserver/purchase.html

This is for stand alone licenses:

https://sparxsystems.com/products/ea/shop/index.html

I guess the big technical question, beyond design capabilities, is which version of XMI is supported to make sure we can import/export the model to other platforms. There is some information here, although not conclusive:

https://www.sparxsystems.com/enterprise_architect_user_guide/13.5/model_publishing/exporttoxmi.html

We can always test it and ask Sparx for more information on their XMI support.

Best,

Flavio


Email from Achim to srg list of 2019-01-15:

Hi Flavio,

Thank you for bringing this up. This will be a question to be discussed and decided by the new MRT group and the TC.

DDI claims to be a standard. In this sense we should try to use a solution which is open for different usages and different users. We should avoid any dependency of a specific tool. I mean this in the sense that there is a risk that the DDI 4 model can only be used in a chosen UML tool. This might be appropriate to a homogeneous environment like a (large) organization. But the requirement in a standards environment is different.

The DDI 4 model is a library. The library should be offered in a way that the library or subsets of it can be used for many purposes. A chosen tool should not be a barrier.

The Canonical XMI format proved to be able to be imported successfully in many major UML tools. In this sense, the DDI 4 as Canonical XMI would be the portable format. This is useful for people who are using directly the model with different tools, i.e. for generating a representation or combining the model with other models.

The issue with the Canonical XMI format is (currently) that most UML tools don't export Canonical XMI but only a custom XMI flavor. MagicDraw and probably Eclipse are the tools which export an XMI flavor that is closer to Canonical XMI.

A general workaround could be to choose a specific UML tool (or to recommend one) and to write a converter from the specific custom XMI to Canonical XMI. (I did this for the XMI flavor which is exported by Lion).

This way, both would be available, a UML tool for model development and a portable XMI format which can be imported into other UML tools.

Another dimension is the cost issue. Any commercial tool costs something. This might be an issue in the standards environment. Enterprise Architect versions start at 229 USD; additional costs apply to software updates.

Enterprise Architect exports to the UML/XMI version 2.4/2.5. (UML 2.4.1 seems to be the most implemented version, 2.5 is the latest.) The issue is that the exported XMI flavor can't be imported in other UML tools. Only MagicDraw offers a custom import of Enterprise Architect XMI 2.1.

There might be possibilities to get free licenses from commercial tools for standards development. I heard that regarding MagicDraw. The issue here is that companies would probably offer only very few licenses. This way, not the whole MRT group could use the UML tool. Furthermore, other users of the model would need a paid license to use the model.

The free and open-source tool Eclipse Papyrus seems to be a good choice. I suggest to look into the Eclipse UML tools in general. Eclipse UML tools use an own custom format for serialization of models, EMF Ecore (Eclipse Modeling Framework), which is available as XMI. A converter would be required for transforming the Ecore XMI format into Canonical XMI. An additional Ecore serializer could be written for Canonical XMI. This might get support from the Eclipse community.

The whole Eclipse UML tools landscape offers much more. There are tools based on the OMG standards QVT Operational (model to model) and MOFM2T (model to text). These tools would enable a transformation from PIM to PSM, and generation of representation encodings (like XSD and OWL). I looked into this a little. My thinking is described in the attached file.

There are Eclipse tools like EMFStore ("repository to store, distribute and collaborate on EMF-based entities (a.k.a. data or models)") and EMF Compare (comparison and merge facility for any kind of EMF Model). It sounds promising but I didn't have a closer look into this. It would need more exploration.

The DDI 4 model uses only a definitive subset of UML class diagrams. This approach builds the basis to create a robust and easy-to-use model which can be used in multiple environments and which can be represented in multiple encodings (representations). A similar approach should be used regarding the UML tool, i.e. using only core features of the tool. This approach can avoid dependencies.

Cheers

Achim

Provenance and relationship with other standards

Email from Flavio to srg list of 2019-01-11

I mentioned in the MT call that we needed to support different types of provenance/lineage. In particular, I'm interested in the so-called why and where provenance. For definitions, please see "Provenance in databases - why how and where - FTinDB 2009" (attached), Sections 1.1.1 and 1.1.3. 

There are many other references, including the original Buneman et al. paper, but this one gives the gist of it. 

Can we represent everything we need to capture these types of provenance with Prov-O or other standards? If yes, how? Else, what is missing?

Food for thought.

Flavio


Email from Jay to srg list of 2019-01-16

Here is a review of several provenance models including the so-called W7 model that Flavio is interested in: http://dcpapers.dublincore.org/pubs/article/viewFile/3709/1932

Here are a couple of articles that align one of these provenance models — PROV-O — with sensor data description and the Internet of Things (IoT):

Sensor Data Provenance: SSNO and PROV-O Together at Last 

Provenance in Systems for Situation Awareness in Environmental Monitoring 

I imagine, in line with suggestions made by Arofan,  that we will want to form a working group on provenance that will make recommendations.

I am thinking that the provenance discussion also raises a larger modeling issue: does DDI intend to copy or plug-and-play other models? We have had this discussion before with Dublin Core. But perhaps we may want to revisit it. That’s because now in DDI 4 we have to decide about SDMX. Because of ongoing UN and EU work, SDMX aggregate data description is sometimes a requirement. In DDI 4 we can continue with the nCubes we currently support in DDI 3.x and perhaps perform an SDMX transformation, we can copy "essential" parts of SDMX, or we can perhaps plug-and-play with SDMX. This is really a good use case for thinking about how in the future DDI plans to align itself with other models and standards.


Email from Flavio to srg list of 2019-01-16

Thanks Jay. I need to dig deeper on your references to see whether W7 describes provenance at the datum level. Either way, the first reference is a great summary of approaches.

Regarding your question about whether DDI should copy or plug-in other models, I tend to lean towards the latter. The large number of use cases and vocabularies out there, most of which are under active development, makes it unfeasible for a small team like ours to replicate in DDI. Besides, there is no need to. We just need to understand the use case, the existing vocabulary we'd like to integrate, and create some minimal anchor objects, if necessary, to be able to plug the vocabulary in. I believe DDI should specialize in a small, well-designed and well-integrated set of classes to cover the aspects of the data (and metadata) lifecycle that other vocabularies either don't cover or cover poorly.

My five cents.

Flavio


Email from Flavio to srg list of 2019-01-16

Another vocabulary we probably need to integrate with is PMML, for predictive models:

http://dmg.org/pmml/v4-1/GeneralStructure.html

Here is an example of what the model looks like, from a work some folks at StatCan are doing on the health domain:

https://github.com/Ottawa-mHealth/predictive-algorithms/blob/master/CVDPoRT/Reduced/Female/model.xml

Flavio

 Virtual meeting 2019-01-09

ATTENDEES: Arofan, Wendy, Flavio, Larry, Dan G., Jay

Technical Business requirements:
Model needs to do:
- Business requirements need to go into detail on how to use the standards
- Lineage and permanence (data point and record level)
- Provenance capture - why, how, and where provenance of a datum
-- each individual cell and the content of it as well as the record level, set level
-- how the data was changed by analyst such as due to a government shut down or change in algorithm
- This needs to be elaborated because this is not a specific business area requirement but a general need
- Use case - BLS is facing this issue and Dan would like to work with this also
- Jay has been using process model plus SDCL with ALPHA at the datum level through data set
-- building out that capability - idea is to see how well we can map DDI4 Data management view to PROV-O
-- PROV-O is very generic and so seems to work best at the higher data set level so it may be useful to provide more detail
-- Transformation between variables (source, observation, transformation)
Hold a meeting on this topic, drilling down specifically into the model to see how we get down to a datum or up from a datum
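To make the datum-level requirement concrete, here is a minimal sketch of a selection that records, for each output datum, a loose reading of where-provenance (the source cell the value was copied from) and why-provenance (the source record that caused the output to exist). The class and function names (CellRef, Datum, select_income) are hypothetical illustrations and are not DDI4 or PROV-O classes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CellRef:
    """Address of one source cell: data set, record, variable (hypothetical)."""
    dataset: str
    record_id: str
    variable: str


@dataclass(frozen=True)
class Datum:
    """An output value plus the provenance captured when it was produced."""
    value: object
    where: frozenset  # source cells the value was copied from
    why: frozenset    # ids of source records that justify the output


def select_income(dataset_name, rows, region):
    """Copy 'income' for every record in the given region, keeping provenance."""
    result = []
    for record_id, row in rows.items():
        if row["region"] == region:
            result.append(Datum(
                value=row["income"],
                where=frozenset({CellRef(dataset_name, record_id, "income")}),
                why=frozenset({record_id}),
            ))
    return result


rows = {
    "r1": {"region": "N", "income": 100},
    "r2": {"region": "S", "income": 200},
}
selected = select_income("survey", rows, "N")
```

Capturing provenance at the moment the datum is produced, rather than reconstructing it afterwards, is the design choice illustrated here; whether PROV-O can carry this level of detail is the open question raised in the notes.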

Technical Modeling Style requirements

Organizational design
- Arofan's documents
- proposal is to create a group that replaces the old modeling team plus a coordinating team
-- modeling - technical requirements handling which feeds into representation groups
-- for each representation
-- liaison team (working with projects doing testing) - business requirements
-- documentation
Example: We have a business requirement regarding data lineage in StatsCan which would be a liaison issue feeding into modeling team

Patterns to help in the tooling so to reflect the pattern base in the representations
Patterns need to be discussed from a technical point of view and the business point of view (implementers and content managers have different needs)

Arofan's email summarizing business requirements

I volunteered on the just-ended call to send out an e-mail regarding the business requirements activity which came out of the Berlin Sprint discussions. Until we organize more formally, this will just be a topic on this (the SRG list) and perhaps we can schedule a call if needed. (I have access to a WebEx so we can do meetings without conflicting with normal DDI calls if that helps).

I would like to summarize the things that I am aware of which are relevant to pursuing the creation of a business requirements document which we can put forward for agreement:

(1) I wrote a high-level document of feedback from the Cross-Domain (second) Dagstuhl week, during the Sprint. Jon has taken this and started extracting some basic scope for framing up actual business requirements.

(2) Flavio has volunteered to document some of the business requirements from an official statistical perspective. Jay, as the DDI liaison to UN/ECE, has offered to help him with this.

(3) Kelly identified during the Sprint that the Prototype feedback in fact contains business requirements, which we will need to surface into whatever document we create, at some point.

(4) Wendy asked on today's call that we identify any areas where we think there might be dependencies/points of contact between this work and other efforts within the MRT work that is now shaping up. I can see that there will be some - certainly the requirements for the business purpose served by the UML model itself (and the style of it) are already on the table, from the Dagstuhl feedback document. I am sure there are many others. We will need to focus on this as we move forward.

(5) It was also suggested on the call (was this Jon?) that the whole MRT proposal, along with this parallel effort regarding business requirements, needs to be presented to the leadership of the Alliance for approval/discussion. This suggests that we may need to create a very short document describing what this activity is and why we see it as important.

I am sure other things may be going on in regards this work which I have not mentioned above - please add anything you see as important.

I think it is early days to organize a call, especially with the holidays approaching, but we should at least try to figure out how best to move forward in the interim. One major question is finding out who wishes to be involved. The names mentioned above are clearly interested (Jon Johnson, Jay Greenfield, myself, Flavio Rizzolo, hopefully Kelly as project manager) but who else would like to get involved? I am sure this is a broader group.

If you are interested in helping frame business requirements - not technical requirements - for the DDI 4 work, please respond to this e-mail.

Also, if people think that an organizing discussion before the holidays would be useful, please speak up. We can easily arrange something.


Next week's agenda:
Organizational approach of where we are going and how we organize the approach
Scope, focus, groups, approach proposal
Who else needs to be recruited to make this functional
What is the approval process
OUTCOME: draft for approval

Modeling technical requirements - need to provide a summary for comprehension
Platform questions - approach to addressing this

Sparx cloud modeling approach for UML modeling https://www.sparxsystems.com.au/enterprise-architect/cloud-services/cloud-services.html
Question of production process - where this fits
XMI output - canonical approach

 Virtual meeting 2018-12-12

ATTENDEES: Wendy, Larry, Dan G., Arofan, Achim, Hilde, Jon, Flavio, Kelly, Jay

Proposal regarding an MRT group to replace or expand the MT
--Needs to be approved as a structure
--Relationship of proposed group with working groups (Data Description, Data Capture, etc.)
--This group can make a proposal to the AG so they can discuss
--Business requirements needed
--If someone from the core MRT team is in contact with a testing team
--Role of each individual needs to be identified
--Arofan thinks he may be able to be more involved, everyone should think about their participation and role, Hilde may be able to join
--At a governance level there is the issue of focusing on the core and whether a year is reasonable. Needs input from the Scientific Board. This is a bit of a re-boot and need to clarify goals for 12 months.
--Prepare to bring into the broader Scientific Board prior to the May meeting
--Focus on short term goals / sprint like
--Get a document out in the near future to
--Making sure that response to Prototype review filters into this
Business requirements document
--Keep as a separate parallel activity for the time being
--Start and then continue detail into February/March
--Having an outline of these requirements should be part of the proposal
Technical requirements - collections discussion, attributes, development process
--Roundtripping, modeling, etc.
Modeling rules for UML - this is needed to define the input and validation to COGS
--This needs to be well thought out in terms of what is the "core of the core" and the principles of what we are expressing in UML
Addressing issues raised in Prototype review and assigned to the MT

What are the real next steps in December and January. Next meeting January 9 (status check) first working meeting January 16
Interest and role in Modeling (MRT) Team should be requested on MT mailings to list over the next month or so. Something on broader list as proposal for group is expanded
The "What are the goals of the group" document should be ready for broader distribution
Technical requirements - Flavio and Wendy
Business requirements - Arofan
Summary of roles etc. - Arofan
Project management - Kelly is discussing with Jared this afternoon, how is project management to go on with this group, we need to start determining dependencies, time requirements, resources should be part of how we work

Email from Achim prior to meeting:
In my understanding following questions should be clarified and would build a good basis for the work next year:

1. Is there an agreement that a group on Modeling, Representation, and Testing replaces the existing Modeling Group?
2. Focus on DDI 4 Core, like Conceptual, Data Description, and Process. These areas are important for any use case perspective. Additional areas can be identified according to business requirements. But the focus on a core increases the chances to have a robust and mature deliverable.
3. Description of major tasks regarding major modeling issues
4. Participants and their roles/perspectives
5. Is the proposed timeline for a DDI 4 Core at end of 2019 reasonable?
6. The development of the business requirements document can be worked on in parallel but is not task of this group.
7. Information on these agreements to other groups and DDI Alliance committees
8. Identification of issues which can be worked on in the next couple of weeks independently of group meetings. People will have longer breaks (I’m not available Dec 22 to Jan 13).

The intention is here to find a common ground on which basis productive work can be done.

Modeling Team was on hiatus while the Technical Committee prepared the DDI4 Prototype for review

 Virtual meeting 2018-03-21

ATTENDEES: Kelly, Wendy, Jay, Larry, Hilde

Agenda:

General Updates on content/files - Kelly et al (10 mins):

  • Do we have links to everything that has been written
  • What is currently in Bitbucket in terms of high level content
  • Documentation of production process and interface with view documentation are two different things
  • Need compiled document on Views - Modeling, Lion to PIM/PSM, Binding specification
  • Problems arise in different points in the production process and we need to determine what needs to be dealt with when
  • Pull together the Modelers documents on Views
  • Documentation of XML binding of Views - Oliver will update at end of week
  • Add examples from Sprint page
  • FHIR documentation https://www.hl7.org/fhir/index.html

reStructuredText examples - Kelly (10 mins):

https://bitbucket.org/ddi-alliance/ddi-views/src/a2c8a3d9fce4d18af096467534ac6a0718e766b8/documentation/src/userguides/variablecascade.rst?at=master&fileviewer=file-view-default

Hilde Questions - Hilde (25 mins):

  • Issues came up about the model (UML) and views
  • What should be in the View? file issues if items should be added/removed from view
  • Images can be instructive
 Virtual meeting 2018-03-07

ATTENDEES: Kelly, Wendy, Jay, Larry, Oliver, Dan

Modifications due to model changes in documentation:
--reflecting change documentation - name change list TC-41
--Views seem to be less stable
--View documentation on two levels (restricted classes needs that level of understanding to understand schema file or if doing RDF need to
--How do we flag what needs to be reviewed due to changes in model (name changes, content of Functional Views)
--ACTION: Add a flag to the change log to notify where additional documentation review should take place (Wendy - DONE)

Content and format updates
--Get Flavio to review earlier documentation to update
--Reviewed assignments for various documentation objects (see DDI Prototype Documentation MASTER)
--Linking between documents
--There are two ways in Lion (external: paste in HTTP link; internal: link from one section in the documentation to another, see http://docutils.sourceforge.net/0.6/docs/user/rst/quickref.html "inline internal hyperlink targets")
--ACTION: provide an example (Kelly/Jon)

 Virtual meeting 2018-02-21

ATTENDEES: Wendy, Jay, Larry, Oliver, Kelly

Kelly will present her organization of work so we can see where we are plugging in
--Folder in Google Drive that will contain prototype documentation where it can be worked on and edited (link on project management page)
https://drive.google.com/drive/folders/1-R_Zt_ECCkJmACJnewh9cNP4d9tiAfvG
--Master spreadsheet to track work on Google site
--May need to add more examples
--Can we review Dagstuhl documents for other documents that should go into the folders
--Future work will be with the documentation group but we need to pull in the documents we have
--Make sure all classes in the packages have complete and accurate documentation
--Oliver will create a spreadsheet with all the classes and their documentation
(this will become part of the nightly build so we can pull out and see where we are at a given point)
--Review sheet for specific assignments or for volunteering

Some rules on what type of issues should be filed where so they get addressed by the right group
--Use prototype ONLY for things that have to happen for the prototype
--modeling issues should be filed in TC; Documentation in DOC; RDFOWL, XMI, XML etc in appropriate trackers, and if it needs to be done by Prototype then add a tracking issue in Prototype

Class level documentation as much as possible in next two weeks

Oliver will check on the transformation issue - XML rendering of regular expressions

 Virtual meeting 2018-02-14

ATTENDEES: Wendy, Larry, Jon, Oliver, Kelly

Role of dual TC/MT member is TC model review - look at where you can best contribute
TC-3
TC-6

Meeting schedule through June - We'll meet next week and then leave meetings scheduled for every other week, if needed, to verify what has been done, what's being worked on, etc.

Issues found during write-ups for TC
TC-7
DVG-27 (2 newest documents) - don't know just how these are used - how the high level connects to the detail

Want to make sure what we have is the most accurate

Documentation issues - review documents for current accuracy and move to DVG-27

DVG-27
DMT-176
DMT-173
DMT-172
DMT-171
DMT-168
DMT-162
DMT-155
DMT-147
DMT-145
DMT-137
DMT-118
DMT-115
DMT-100
DMT-97
DMT-84
DMT-83
DMT-80
DMT-72
DMT-23
DMT-18

Use cases / examples

DVG-28
DMT-182
DMT-154


 Virtual meeting 2018-02-07

ATTENDEES: Wendy, Jay, Larry, Dan, Oliver, Kelly

Agenda briefly:
--replacing targets that are pattern classes (mostly related to methodology pattern)
--if we subjectOfDesign ECVE can we set it to a specific value - document that these should match
--extend from methodology overview
--ACTION: make classes of SimpleMethodologyOverview the extension base for other realizations of these classes. Note that this has made several classes with NO additional properties/relationships whose sole use is to limit the target class of a relationship to a specific subtype DONE
--a few quick include/don't include decisions for Descriptive Codebook
--ConceptualInstrument - ImplementedInstrument is the cutoff with documentation that this is where it would link into the DataCaptureInstrument - also keeps in line with what is covered in 2.5
--ACTION: add ones in red make changes to pattern targets DONE
--ACTION: add hasVariableCollection to Study to tie in the use of requested variable collection DONE
--documents needed by documentation group
--DVG-27 is the site for dumping any and all documents, drafts, notes to feed into higher level documentation
--created DVG-28 as a site for dumping use cases, test cases, and examples
--class documentation relation to GSIM and there are lots of issues between DDI 4 and GSIM
--Jay is working with Jason Blackwell (UNECE) on a document covering conceptual and logical models - this should be at least referenced by DDI documentation to help clarify the relationships


what does it mean to "End" at a certain point in, for example, the variable cascade
X1) include the class i.e. InstanceVariable but don't add any un-included relationship targets

2) don't include the class i.e. InstanceVariable and document that this is the linking point to a larger range of packages

 Virtual meeting 2018-01-24

ATTENDEES: Wendy, Jay, Oliver, Larry, Kelly, Dan

RectangularLayout becomes UnitSegmentLayout

Larry made change and will follow-up with documentation search and fixes

If we are going down to the datum and data point we are missing describing the Set of Units. We can say what the population was but we don't have a means of subsetting by definition. Want to lay out the issue. How would this relate to where VariableStatistics would need to be attached. Possibilities would be use of IdentiferViewPoint, Transformation Processes, possibly creating an Index or other means of addressing this.

We want to be clear on what the prototype does, what it doesn't do, and issues.

How far up the Variable Cascade
List of Functional Views --
Conceptual Content View : Concepts --- InstanceVariable
Data Description View : Datum --- InstanceVariable
Data Capture Instrument View : Capture --- RepresentedVariable
Custom Metadata View : CustomXX --- InstanceVariable
Statistical Classification View
Structured Geography View
Agent Registry View
Sampling View
Data Management Process View : Include Business Workflow and look for the edges
Descriptive Codebook View : current content and coverage

Use the Variable Cascade as a central hub of where everything plugs in thereby facilitating connection to different parts. Model hangs together how extensions can plug in (different capture modes, different storage modes, etc)

Workflow work is now stable - Business Workflow


 Virtual meeting 2018-01-17

ATTENDEES: Wendy, Jay, Larry, Oliver, Dan, Kelly

Data Description:
Logical Description - final?
Format Description See DDI4DATA-25

Rectangular - does it also include a CSV as long as all lines have the same number of columns
Rectangular being all records of the same type and same layout
Rectangular was fixed length
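The "same number of columns" condition for a delimited file can be checked mechanically. The following is a minimal sketch using only the Python standard library; the function name is_rectangular is hypothetical, and the check says nothing about whether all records are of the same logical type, which remains a modeling question.

```python
import csv
import io


def is_rectangular(text, delimiter=","):
    """Return True if every non-empty record has the same number of columns."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter)
    # Blank lines are returned by csv.reader as empty lists and are skipped.
    widths = {len(row) for row in reader if row}
    return len(widths) <= 1


same_width = is_rectangular("id,age,sex\n1,34,F\n2,51,M\n")
ragged = is_rectangular("id,age\n1,34,F\n")
```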

SingleLogicalRecordFile - Flat? UnitRecordFile?
Single logical record - could be multiple physical segments
Can be fixed length OR delimited

MultipleLogicalRecordFile - Hierarchical/Relational

Two things:
How many logical records are in the file
Is it fixed or delimited?
If multiple logical records - hierarchical or "rectangularized"

FlatSegmentLayout
Single type - can have multiple physical segments each must belong to a single logical record (segments are flat within their logical record)

Prototype -
Multiple logical record types
Multiple segments (that can be ordered)
Association between logical records

Cube - dimensional store
Events are another layout - tall skinny file

ACTION:
Finalize a name of this particular physical record layout
Get a description - of this and vocabulary below
Logical and format we have what we need - double check examples

Vocabulary agreement:
What is the physical layout of a logical record (can have multiple segments)?
PhysicalSegmentLayout
A collection of physical layout formats in a single file?
PhysicalFile
A collection of closely related files (like in a relational data base)?
PhysicalDataSet
A collection of multiple files of many types within a single store?
DataRepository
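The containment implied by this vocabulary can be sketched as nested data classes. This is only an illustration of the agreed terms: the attribute names (delimited, layouts, files) are assumptions and are not taken from the model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PhysicalSegmentLayout:
    """Physical layout of (one segment of) a logical record."""
    name: str
    delimited: bool  # assumption: False stands for fixed length


@dataclass
class PhysicalFile:
    """A collection of physical layout formats in a single file."""
    name: str
    layouts: List[PhysicalSegmentLayout] = field(default_factory=list)


@dataclass
class PhysicalDataSet:
    """A collection of closely related files, as in a relational database."""
    name: str
    files: List[PhysicalFile] = field(default_factory=list)


@dataclass
class DataRepository:
    """A collection of multiple files of many types within a single store."""
    name: str
    files: List[PhysicalFile] = field(default_factory=list)


person = PhysicalFile("person.csv", [PhysicalSegmentLayout("person", True)])
household = PhysicalFile("household.dat", [PhysicalSegmentLayout("hh", False)])
survey = PhysicalDataSet("survey", [person, household])
store = DataRepository("archive", [person, household])
```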

ACTION:
Change name of property fileName to physicalFileName

Items 1 and 2 under Agreed are agreed
Item 3 collection of PhysicalSegmentLayouts format a Logical Record

Under Questions:
1 and 2 were agreed to
3 - working on the specific name and documentation
4 - resolved above renamed to PhysicalDataSet


Project Management Question:

How does verification of use cases move into the documentation?
Each use case has description, examples etc. how to file these for incorporation for the documentation group.
Making a list of use cases, examples, what gets handed off and contributed to during documentation period.
DMT-176 place for class documentation
Distinction between test cases and use cases - test cases are useful for describing how things work and testing out for implementing - use cases are broader (ANES, Transformations, etc.)
Use cases relate to what is covered by a Functional View
Make sure there is a place where content for pass off to next group is collected - will add issues to DVG


 Virtual meeting 2018-01-10

ATTENDEES: Wendy, Jay, Jon, Oliver, Larry

LOGICAL and FORMAT
Issue of data description (logical and physical) - what changes are still possible?
What is a tweak and what is a rework?

When Jay looks at the model now, it and the bindings Deirdra has been working with are out of sync, and there are still some issues to finalize before entering the changes in Lion. There are lots of collections that bounce up against each other and this is being simplified.

Jay has been sending out models of the agreements. He has been testing this model and making examples. DMT-176

One of the big changes: we decided that Viewpoints hang off the Unit Record Relation Structure. There are no Viewpoint relation records
Wendy will enter changes in Lion based on the most recent ppt in DMT-176

Simplification resulting from mechanical entry of collections - showed up a number of duplications of activity. These have been cleaned up.

Language of object - DMT-177
Often need to support several languages which is a problem in the XML as an attribute can't be repeated.
Solution has been to create a list of xs:language
Changes were initially made in Lion. These need to be updated in terms of documentation and cardinality. This allows the transformation to be based on the use of the datatype.
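As an illustration of the list approach: an XML Schema list type is written as a single whitespace-separated attribute value, so several languages fit in one attribute. The sketch below reads such a value; the element name Label and the attribute name languages are hypothetical and are not taken from the DDI4 schema.

```python
import xml.etree.ElementTree as ET


def languages_of(element, attribute="languages"):
    """Split a whitespace-separated list attribute into individual language tags."""
    return element.get(attribute, "").split()


# One attribute carries all languages, since the attribute itself cannot repeat.
doc = ET.fromstring('<Label languages="en de fr">Age</Label>')
tags = languages_of(doc)
```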

Move CDE to separate package
Move Catalog items to separate package
Evaluate time needed to update CustomVocabulary - create a meaningful view - need a use case creating Controlled Vocabulary publication option

Workflow/Process -
2nd view to support transformation
Look at GSIM business process to see what is covered/what not - BTL - GSIM information model
Main difference now has to do with collections vs nodes

NEXT MEETINGS

Get data description nailed down in the next 2 weeks. Need to finish views and enter them by January 31.

Jay and Larry will talk about how Format relates to logical and present next week.

 Virtual meeting 2018-01-03

ATTENDEES: Wendy, Larry, Jay, Dan, Oliver

DMT-182: Format structure issues raised in working on Use Cases. Data Description issue will discuss at next meeting.
Get resolved by mid-January for update prior to end of month

Darrin's example - can Jay see what you're doing - Wendy will email to get his RDF

Wendy will go through remaining views and potential views and draft for discussion

DMT-148 - looked at specific issues (skip those in Qualitative)
Raised the issue of Common Data Element - difference with RepresentedVariable
Need to determine if this should be in Prototype before the end of January

Final review piece is to identify those ComplexDataTypes that are total orphans - find them and isolate - Oliver will write script for validation of this problem

Script to identify orphans in general - with package information - Oliver
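A minimal sketch of what such an orphan check could look like, assuming the model has already been exported into two plain dictionaries. The input shapes and all names below are hypothetical; this is not Oliver's script.

```python
def find_orphans(datatypes, class_property_types):
    """Return (package, name) pairs for datatypes that nothing refers to.

    datatypes: {name: {"package": str, "uses": [names of types it uses]}}
    class_property_types: {class name: [names of types used by its properties]}
    """
    referenced = set()
    for types in class_property_types.values():
        referenced.update(types)
    for name, info in datatypes.items():
        # A datatype that only refers to itself does not count as referenced.
        referenced.update(t for t in info["uses"] if t != name)
    return sorted(
        (info["package"], name)
        for name, info in datatypes.items()
        if name not in referenced
    )


datatypes = {
    "TypeA": {"package": "PkgOne", "uses": ["TypeB"]},
    "TypeB": {"package": "PkgOne", "uses": []},
    "TypeC": {"package": "PkgTwo", "uses": ["TypeC"]},
}
class_property_types = {"ClassX": ["TypeA"]}
orphans = find_orphans(datatypes, class_property_types)
```

Reporting the package alongside the name matches the request to have package information in the output.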

ISSUES for resolution before end of January

Extensions to Workflow for Prototype - Jay will gather information, we'll create issue and determine what needs to be discussed

Add lists of classes needing documentation in other packages in Prototype


 Virtual meeting 2017-12-20

ATTENDEES: Wendy, Jay, Larry, Dan, Oliver, Kelly

Workflows:
Results of Jay's review of workflows as a basic realization of Process. Need to capture transformation as ETLs rather than statistical packages. DDI4 metadata in workflows used to try out and test work in PENTAHO. Using this to identify any new or modified classes for workflow. ComputationAction is one of the subtypes of Act (others are related instrument components for a data capture instrument). It can capture code. What is needed is a means of capturing a clear structured description in XML. The approach being used, instead of capturing code, is the creation of XML that is fed into a machine to run the recode.

PENTAHO etc. start with metadata and produce the transformation using that as a driver. Suggest the use of a MetadataDrivenAction. Add the ability to add a correspondence table which would hold the relationships used for recode. The user creates the correspondence table (ex. the IPUMS transformation table). Addresses Joins, recodes, renames
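The metadata-driven idea can be sketched as follows: the correspondence table is plain data and a generic routine applies it, so the recode is described rather than hand-coded. The function name apply_correspondence and the sample codes are hypothetical; this does not reproduce PENTAHO or the IPUMS transformation table.

```python
def apply_correspondence(records, variable, table, target_variable=None, default=None):
    """Recode one variable using a correspondence table (source code -> target code).

    If target_variable is given, the recoded value is written under that name
    and the source variable is left untouched; otherwise it is overwritten.
    Codes missing from the table are mapped to `default`.
    """
    target = target_variable or variable
    recoded = []
    for record in records:
        new_record = dict(record)  # do not modify the input records
        new_record[target] = table.get(record[variable], default)
        recoded.append(new_record)
    return recoded


# The correspondence table is metadata created by the user.
marital_table = {1: "married", 2: "single", 3: "single"}
records = [{"id": "a", "marst": 1}, {"id": "b", "marst": 3}, {"id": "c", "marst": 9}]
result = apply_correspondence(records, "marst", marital_table,
                              target_variable="marst_label", default="unknown")
```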

Question: Are you aiming at being able to roundtrip and generate the PENTAHO from the DDI? Yes, we can do that. It's Java based. These systems also support formulas and can run scripts. They still want to use their statistical packages to do certain things. Don't want to do statistical analysis in a data management environment. Every time they changed data management they had to rewrite the STATA code because no one could understand another's code. If you capture the algorithm and generate the code, as opposed to trying to derive the algorithm from the code, you can see what is going on within the eventual code.

Jay will send to Wendy and Wendy will put in to verify that what Jay is proposing is clear.

LION content:
reviewed and agreed on package/view dispensation DMT-140

Jay to send Kelly a write up of the decisions made in Dagstuhl regarding the logical data description, some of which have not yet made it into the model. Wendy to follow up with Jay about creating an issue for integrating these decisions in the model.

 Virtual meeting 2017-12-13

ATTENDEES: Wendy, Larry, Jay

Reviewed the work plan for December 2017 through January 2018 covering the MT review of the Prototype. Posted on DMT-175

 Virtual meeting 2017-11-29

ATTENDEES: Wendy, Larry, Dan, Oliver

DMT-141 - change xs:string to ECVE (entered)
DMT-134 - resolved - in entering changed "broadest" to "highest" to clarify relationship to "lowest"
DMT-66 - resolved (entered)
DMT-144 - generatedBy target Act - target 0..n change after reading content target is 0..1 with a note on how to handle a series of Acts to generate the content of the InstanceVariable
DMT-148 - use base class in very common relation names like "contains"

 Virtual meeting 2017-11-22

ATTENDEES: Wendy, Larry, Jay, Oliver

Annotation

Document Information
Dealing with documents in RDF - Ben and Oliver discussion
Every triple store already cares about triples as quads, as they represent the major box the triples come from. We could use that to identify meaningful usage as archival documents (for instance at GESIS that is put into the long-term preservation for recreation of databases); we would be safe if we carry those specific documents also into those "box identifiers" in RDF. We could bring document info into the RDF world with the connection to the corresponding triples. We talked about that and figured out that it is currently not covered in the LOD research community. Bring this to the conference in Greece (paper deadline in January) - Oliver and Ben plus Jay and Larry as co-authors plus asking Eric. Interesting and solid use case for that approach.

Implications for document information - none at the moment - could leave as is and recommend that for certain RDF instance bindings

Right now document information is put into all Views - should this be a deliberate selection based on the need for a persistent document rather than an interchange
Would there be cases where you'd want to use the codebook view just for exchange - even a pure interchange view could make use of document interchange properties

FOR PROTOTYPE:
Leave as it is for now
Raise the question of whether this should be available on all views
What would link the DocumentInformation to the rest of the information?

How would you use this information in a SPARQL query? What you could do for instance if you already know something you want to query based on a study series. You know about this one document and you want to find the study identification to find variables. For most cases you would just use it for retrieving the provenance information on the metadata.

Annotation usage:

What annotations apply to - the metadata object
Annotation of the document information is about the - review document and Document Information and looking at a specific example. Where does this piece of information about a codebook go. Australian document that Larry is doing is a good example.
DECISION: Larry will check to see if there is any major issue that needs to be addressed ASAP. Otherwise, this could continue on through MT review.


Process Model and Methodology:
Jay's ppt on DDI Methodology as a Data Management plan as a traversal over a GSBPM/GLBPM once or in a series
Extend the workflow process with new properties that support those different types of workflows (series) examples
Jay will update and send out via DDI-SRG list to MT
We want to identify one or two possible extensions of Workflows to support data processing, GSBPM management etc.
Didn't seem to change what Larry was doing because there was the Methodology Overview which allows discursive rather than specific processes
Maybe just the Study workflow, Project management/workflow, more than "data", work plans, Process management

 Virtual meeting 2017-11-15

ATTENDEES: Wendy, Jay, Larry, Dan G., Oliver

ACTIONS:

Dan G. - review the issues at the bottom of the list with ? (100, 115, 141) and confirm that they have been addressed by Data Description. Any documentation should be added to these issues over the next month or so in order to make sure it is available -

  • These have been identified as documentation and Larry will work on providing a clear story about how these work together during the documentation period of prototype work


All - DMT-157 I've reviewed and made comment so I think we can agree and resolve it. If there is no dissent then I'm happy to enter that change.

OK resolve

Wendy - will try to get geography structures ready to look at next week

Geography is almost done. It extends CodeList: 2 questions
1) Can a Unit have only a single Unit Type? (why?) - some units are in samples in many surveys expressing different Unit Types. However the Unit Type is so generic that it's not as much of an issue. Extensional definition: you list all of the available kinds where you have a list of things that actually defines it. Concepts can be roles rather than kinds, distinguished by attributes rather than use.

2) Would an abstract base for a CodeList/Statistical Classification/etc. be easier to handle (limits extension depth, opens up for other forms of signifier/signified options)
file and raise again after prototype

DMT-159 - resolved


Annotation issues:
What is a document? When is it a document (in XML? in RDF?)? what is being annotated? what is being cited? Differentiation between say "creator of the content" and "creator of the XML binding of the content". Is it annotation of the metadata or the object described by the metadata?

Access issues: access to data, access to metadata, persistent access restrictions/rules, local restrictions/rules

In RDF you'd have some general information about a triple store but would not be able to distinguish the different sources that different triples come from. If you use quads you can have identification information on the triples. You are not able to say "give me the document root"; you'll always land at the level of the triple store, not the package of related triples.
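
The triple/quad distinction can be illustrated with a minimal sketch. It uses plain Python tuples rather than an RDF library, and every identifier in it (ex:var1, ex:codebookA, and so on) is hypothetical, invented only for illustration. The fourth element of a quad names the graph a statement belongs to, which is what makes it possible to ask for one "package" of related statements.

```python
# Sketch only: plain tuples stand in for RDF statements.
# All identifiers (ex:var1, ex:codebookA, ...) are hypothetical.

# Triples: (subject, predicate, object). Nothing records where each came from.
triples = [
    ("ex:var1", "ex:label", "Age"),
    ("ex:var2", "ex:label", "Income"),
]

# Quads: the fourth element names the graph (the source "package").
quads = [
    ("ex:var1", "ex:label", "Age", "ex:codebookA"),
    ("ex:var2", "ex:label", "Income", "ex:codebookB"),
]


def statements_from(quads, graph):
    """Return the (s, p, o) statements that belong to one named graph."""
    return [(s, p, o) for (s, p, o, g) in quads if g == graph]


from_a = statements_from(quads, "ex:codebookA")
```

With triples alone there is no equivalent of statements_from: the store can only be queried as a whole.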

https://www.w3.org/2001/12/attributions/

RDF provides no model-level division between data and metadata

Jay and Oliver will work on this

Workflows is on the agenda for next week

 Virtual meeting 2017-11-08

ATTENDEES: Wendy, Jay, Larry, Dan G., Oliver

spreadsheet used in discussion

Assignments for next week

Dan G. - review the issues at the bottom of the list with ? (100, 115, 141) and confirm that they have been addressed by Data Description. Any documentation should be added to these issues over the next month or so in order to make sure it is available

All - DMT-157 I've reviewed and made comment so I think we can agree and resolve it. If there is no dissent then I'm happy to enter that change.

Workflows is on the agenda for next week

Wendy - will try to get geography structures ready to look at next week

All - review items associated with Annotation/Citation, we need to determine what must be addressed for prototype and what if anything can be delayed. Also, what is modeling and what is documentation.

Oliver - add an issue to this Annotation/Citation set that addresses the issue identified in Codebook meeting as well as fuller documentation


 Virtual meeting 2017-10-04

ATTENDEES: Wendy, Jay, Larry, Oliver, Dan G.

Codebook will be reviewed for new classes, changes of identifiable to Complex Data Type, etc.

Change name of CodeItem to CodeIndicator (done)

CodeList - will always have contains with CodeIndicator and may have isStructured: ClassificationRelationStructure which points just to category (done)

Instructions should indicate that you must use contains: CodeIndicator for simple and structured CodeLists. If the CodeList is structured use isStructuredBy: ClassificationRelationStructure to provide additional information on complex structure (done)

LogicalDataDescription
issues with LogicalRecord and LogicalRecordLayout. Need to clear up critical content early next week.

 Virtual meeting 2017-09-20

ATTENDEES: Wendy, Larry, Jay, Oliver

XMI and definition of default values and regular expressions - Larry will send Oliver some XMI examples

Start entering following Flavio's review pattern and then realize where needed

Add ability to create a variable group and a statistics group (has to relate in some way to a data file)

Codebook is down to finishing up relationship to DCAP and whether to include a relationship to Concept


 Virtual meeting 2017-09-13

ATTENDEES: Wendy, Jay, Larry, Dan G., Oliver

Codebook View Review:
Status of DMT issues required for Codebook - reviewed, revised list, assigned
Status of Codebook group work
is it ready for review
What materials will be coming from the group - documentation, examples, etc.

Looked at comments assigned to Codebook during last comment period. All but 2 resolved, plan is to revisit at next meeting in 2 weeks. Can review what is there with recognition of work remaining. Checking on moving meeting up a week to help meet deadline.

Can I enter collection and realization changes for those classes used by Codebook

 Virtual meeting 2017-07-26

ATTENDEES: Wendy, Jay, Oliver, Larry, Dan G.

Current state of Collection revision:

  • Jay spent some time with Dan to walk through it. What came out of this was that we wanted to organize the thing a little more so that the statistical classification fell out of the code list which fell out of the classification set.
  • The other was do we want to represent the relationships the way Wendy was representing them there or just have views of the relations to represent the relationship and kind of deal with relationships separately.
  • How well does this reflect GSIM, and how much is fine tuning or gratuitous remodeling? There seems to be a more straightforward way of doing things. We want to provide something that is transparent and intuitive. We want people to look at this and say "Oh that makes sense".
  • Dan didn't have time to work on stuff so didn't get into the weeds.
  • Jay provided a roadmap. How does this relate to the representations package?
  • It was pretty clear that we began with realization and a means of simplifying it.
  • We need to be able to explain relationship to GSIM and the Node/NodeSet was a means of being able to attach each of the basic things and then work on the details.
  • We have to be able to show a clean map from GSIM to DDI.
  • Not everything that is in the "pattern" package should be in the pattern. Right now that is muddied by having to put these into the same package. It is "OK" not to have it in the pattern because WE are the ones that are building the realizations not the end user.

Where do we go from here

  • Wendy needs to send out her notes
  • Jay and Dan need to work this out
  • We need to agree on some kind of road map
  • This is fundamental stuff and we need to get it nailed down before Dagstuhl

We need to have a clear workplan and priorities over the next 18 months

As we're working on this can we create mini-examples.

Oliver will create a means for us to make builds.

 Virtual meeting 2017-07-12

ATTENDEES: Wendy, Jay, Dan G.

Discussion of collection realizations:

  • Statistical Classification is the same as a CodeList
  • Not necessarily in practice
  • Statistical Classification needs to be mutually exclusive and exhaustive as well as managed
  • Extension of Statistical Classification from CodeList would support use of Statistical Classification as enumerated value domain
  • The difference should be between managed and unmanaged; mutually exclusive and exhaustive are boolean features.
  • Add spatial relations for use as specialized classes of base relations
  • If you inherit from CodeList you can use all of these
  • The distinction between category set and codelist is having designations (signifier associated with signified)
  • A signifier is currently a string in a Code but could be an image or sound etc.
  • Dig deeper into designation down the road if we want to pull things together in this way.
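
The points above can be sketched roughly as follows. This is an illustrative sketch, not the DDI model: the Python class and attribute names (is_managed, is_mutually_exclusive, is_exhaustive, enumerated_values) are assumptions chosen to mirror the discussion. A Code is treated as a designation pairing a signifier (a string here) with a signified Category, and StatisticalClassification extends CodeList so that it can be used wherever a CodeList serves as an enumerated value domain.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Category:
    """The signified: the concept a code stands for."""
    name: str


@dataclass
class Code:
    """A designation: a signifier paired with a signified Category.
    The signifier is a string here; the minutes note it could be an image or sound."""
    signifier: str
    category: Category


@dataclass
class CodeList:
    codes: List[Code] = field(default_factory=list)
    # Boolean features (attribute names are assumptions for this sketch).
    is_managed: bool = False
    is_mutually_exclusive: bool = False
    is_exhaustive: bool = False


@dataclass
class StatisticalClassification(CodeList):
    """Extends CodeList; managed, mutually exclusive and exhaustive by default."""
    is_managed: bool = True
    is_mutually_exclusive: bool = True
    is_exhaustive: bool = True


def enumerated_values(code_list: CodeList) -> List[str]:
    """Use any CodeList (or subclass) as an enumerated value domain."""
    return [code.signifier for code in code_list.codes]


# Hypothetical example content.
sex = StatisticalClassification(
    codes=[Code("1", Category("Female")), Code("2", Category("Male"))]
)
```

Because StatisticalClassification inherits from CodeList, enumerated_values(sex) works without any special handling, which is the benefit of extension noted in the discussion.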

Jay and Dan G. will walk through the realizations next week. Will also have a discussion during TC meeting period this week as there is no TC meeting

Wendy will review what extending Statistical Classification from CodeList would look like. Also explore implications of extending Unit Type, Universe, Population from Concept

 Virtual meeting 2017-07-05

ATTENDEES: Wendy, Jay, Oliver, Larry

Went over emails regarding collection model and realization
Talked about Universe, Population, UnitType being a subtype of Concept (using concept sets to describe collections of them)
CategorySet to CodeList to StatisticalClassification

What needs to get done for CodeBook:
Statistics - Variable and category level (DataDescription)
Vocabulary for types of methodologies - we have the generic but need some documentation probably flesh out what Sanda did in a formal document

Documentation of View capture:
Want to do documentation at the view level which is more for users of view but we don't really have the resources at the moment to add class level documentation. Figure out what kind of document is needed and what format we should use now. See what format we have and then put documentation in that format and find anything that doesn't have a home.

 Virtual meeting 2017-06-28

ATTENDEES: Wendy, Oliver, Jay, Dan G.

Proposed game plan discussions

  • Software version of the content is good but there could be other drawbacks

Documentation of Views:

  • YAML structures were discussed, putting these examples into Lion
  • Be able to have instances to prove and use as examples

Patterns:
Update on realizations and XML for Collection Pattern

  • Classification - maybe start with CodeList and then add levels
  • Make a view for realizations
  • Workflow
  • Logical/Format description
  • CustomValues

Signification Pattern (new issue DMT-137 describing task).

  • Please review model in terms of locations where signification pattern should be realized.
  • We need to have clearer rules in terms of realization
  • Many classes, such as concept, realize multiple pattern classes and we need to be clear on how these complement each other or conflict (if they do so)
  • The extent to which we use a pattern - there are instances where the pattern applies but may be of such a character that we aren't realizing (i.e. Identification where we can specify how IDs are formed by relationship to the pattern, but to most seem pretty straightforward). We don't want to twist people into strict knots by forcing realization of a pattern. Difference between the conceptual model and creating the binding and implementing the binding.
  • The big issue is whether people want to actually model it. At what level is that made invisible - at the point of binding or the point of an instance.
  • The sweet spot is in doing metadata management. You can start looking at ties between identifiers and how they are formed.
  • How do we go about exploring this in an efficient way. Lawrence paper should be linked to DMT-137
  • How it's currently modeled is tied to nodes and so if we lose nodes there isn't a representation to hang our hat on.
  • In walking from the abstract to the concrete it had to do with the modeling. Signification would need to be pulled into at least realizations of part of the collection. May cause the review of the rule that pattern classes can't realize other pattern classes. They extend.

APPROACH:

  • Re-look at the material we have
  • Review the use of Nodes and effect on signification on that
  • Create a realization of Statistical Classification or CodeList which use both collection and signification
  • Evaluate signification within this
  • Use signification only where it is useful
  • Better definition of when signification should be used and where it may be superfluous
 Virtual meeting 2017-06-14

ATTENDEES: Wendy, Dan G., Jay, Larry


Review of revised collection pattern as found in NewCollectionPattern

  • There is the idea of a collection and then that a collection is made up of a structure
  • Base collection (abstract) then specializes to an unordered, strictorder, orderrelation.
  • There is the collection which can be of 3 types - but we want a means of describing the entirety of the
  • Too much stuff in the base class - separate the thing from the underlying structure
  • What we want is a name of the whole collection just at the root
  • A "proposed collection" that then related to information about the collection and then that contains the structures
  • Creates a bag of bags as an entry point to the collection


Move from BaseCollection to NewCollection

  • type 1..1 CollectionType Binary choice of Bag or Set
  • name 0..n Name A linguistic signifier. Human understandable name (word, phrase, or mnemonic) that reflects the ISO/IEC 11179-5 naming principles. If more than one name is provided provide a context to differentiate usage.
  • purpose 0..1 InternationalStructuredString Explanation of the intent of this collection. Supports the use of multiple languages and structured text.
  • usage 0..1 InternationalStructuredString Explanation of the ways in which some decision or object is employed. Supports the use of multiple languages and structured text.

Add to NewCollection

  • (make a note that this SHOULD be realized as AnnotatedIdentifiable)
  • isDefinedBy points to base collection which is the root bag
  • [totality, semantics, hasRelationSpecification]
  • class by definition is the master collection
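
The property list above can be read as a rough sketch like the one below. It is only an illustration under stated assumptions: Python strings stand in for Name and InternationalStructuredString, BaseCollection is reduced to a list of string members, the attribute names are Python-style renderings of the properties in the minutes, and the rule that a Set rejects duplicate members is an assumption about what "Set" means here rather than something stated in the discussion.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class CollectionType(Enum):
    """Binary choice of Bag or Set."""
    BAG = "Bag"
    SET = "Set"


@dataclass
class BaseCollection:
    """Stand-in for the root bag; members are plain strings in this sketch."""
    members: List[str] = field(default_factory=list)


@dataclass
class NewCollection:
    collection_type: CollectionType                  # type 1..1
    is_defined_by: BaseCollection                    # isDefinedBy: the root bag
    name: List[str] = field(default_factory=list)    # name 0..n
    purpose: Optional[str] = None                    # purpose 0..1
    usage: Optional[str] = None                      # usage 0..1

    def __post_init__(self):
        # Assumption for illustration: a Set may not repeat members, a Bag may.
        members = self.is_defined_by.members
        if (self.collection_type is CollectionType.SET
                and len(set(members)) != len(members)):
            raise ValueError("a Set collection cannot contain duplicate members")


# Hypothetical example: a bag may hold the same member twice.
example = NewCollection(
    collection_type=CollectionType.BAG,
    is_defined_by=BaseCollection(["a", "a", "b"]),
    name=["Example collection"],
)
```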


Follow-up from Lawrence KS..who's working on what:

  • Jay..ppt continuing (waiting on realizations to complete) - what happens to realizations of the process pattern
  • Jay..beginning to see the work Chifundo's doing as a means of road testing some of the pattern stuff we're doing
  • Larry..will work on examples when test realizations are done
  • Dan..probably has something outstanding and will work on test realizations


Related work:

  • Qualitative and custom should be cleaned up (Dagstuhl)
  • Dan is working on LIM
 Virtual meeting 2017-06-07

ATTENDEES: Wendy, Jon, Jay, Oliver, Dan, Larry
Issues from Lawrence

  • DMT-134 resolved - in review
  • DMT-133 resolved - in review
  • DMT-132 documentary, Oliver will clarify process and add to documentation of bindings (also need to add to property documentation - use of name "content")

Collection Pattern:
Singletons

  • A singleton is a bag so we can pick an element; one thing that we were throwing around was the idea of having a shortcut representation of where your domain would be a bag. Our language is such that it is really talking about a single thing to a bag.
  • It may be easier to do this by pointing to a member than to a singleton bag. Worry about the representation when we get there.
  • If you have something that is pointing to an object and you want it to another object
  • Proposed collection example: what elements in the range are the target of the relationship
  • Larry's hierarchy issue of breaking relationships
  • One of the advantages we have here is that we are talking about a pattern, so in talking about specific realizations we need to realize in a way that prescribes its usage which say forces them to use a parent/child or part/whole relationship
  • The proof will be when we start looking at realizations
  • We need to think about what it means to have a relationship to the relationship to the specific items or to the items in the inner relationships - clarify this in documentation
  • The domain of the inner relationship should be the target of the outer relationship
  • Hierarchies are created a level at a time (if transitivity applies all the way through you're OK, if not, this could be an issue)


Layout for work this summer (prep for Codebook development review)

  • Oliver: August all
  • Dan: August most
  • Jon: end of July and all of August
  • Larry: teaching in June in afternoons
  • Jay: generally available
  • Wendy: July 19th

ACTION: Wendy will review work and draft summer work plan and send out for comment

 Virtual meeting 2017-05-10

ATTENDEES: Wendy, Jon, Jay, Dan G., Oliver, Larry
RESOLVED

  • DMT-99 - Should identifiable have a derivedFrom property
  • DMT-78 - Scope of International Identifier should be broadened   (additional material added for consideration post meeting)

Discussed and determined to be part of a larger issue on access restriction

  • DMT-90 - Should there be additional properties at the annotation level

Remaining

  • DMT-26 - Reg expression serialization in model should support multiple bindings
 Virtual meeting 2017-05-03

ATTENDEES: Wendy, Oliver, Jay, Larry
to do before sprint
DMT-99 - Should identifiable have a derivedFrom property
DMT-78 - Scope of International Identifier should be broadened
DMT-90 - Should there be additional properties at the annotation level
DMT-26 - Reg expression serialization in model should support multiple bindings
Actions taken:
DMT-16 - Incomplete list of xs:datatypes in primitive RESOLVED
DMT-66 - Population time space and Unit ON HOLD [need to complete some spatial issues first]
DMT-12 - Review use of URN as opposed to URI RESOLVED [moved to RDFOWL]
Moved to sprint
DMT-105 - Abstracts in models: usage rules
DMT-109 - Create a clear definition of what a view is
DMT-112 - Review ExecutionPair in ComplexPattern
DMT-72 - DocumentationInformation

 Virtual meeting 2017-04-26

ATTENDEES: Wendy, Jay, Larry
Variable cascade - Dan is not available today

Collections

  • Asymmetric Relation is a bit of a puzzle. It doesn't make sense mathematically. Ordered Tuples as represented in graphs don't have a source and target. A vector is an operation on Tuples. Lisp has an operation which returns a beginning and the rest, which is how you distinguish the role of members in the Tuples. Other languages don't work like Lisp and have different operations, like a linked list or doubly linked list.
  • There may be a way to accomplish this in a simpler way.
  • What Dan said: when we model patterns and do more process stuff and fewer methodology things, we want to think about the temporality of things. We want to think about things that go back and forth (from a study or a measure, then changing as we learn more and execute it, when we become interested in execution and what happens during it, and then when we refocus on dispersal, use, understanding, replication), and about platform independent being human readable, platform dependent being actionable/executable.
  • Distinction between algorithm and process. Went round and round and maybe no distinction. If you describe a coding operation and the algorithm but in doing it you used a black box procedure in a specific software.
  • The example that comes up in a statistics class in determining error
  • Realization: If you want to describe the actual execution you're stuck with binary relations because you need to talk about the interfaces between steps. But do we need that everywhere for everything? Can we describe things that are not so entangled in a simpler way such as the flow of a questionnaire. We have to know at what point we need the binary relations.
  • We'll pull this together now on the issue and ask for Flavio's reaction. Then nail down during Sprint. (use DMT-116) Underlying model of DDI 4.

Sprint prep:

  • What gets addressed?
  • What outcomes are required?
  • What do we need to do in preparation?

to do before sprint

  • DMT-105
  • DMT-72
  • DMT-16
  • DMT-109
  • DMT-66
  • DMT-12
  • DMT-99
  • DMT-78
  • DMT-90
  • DMT-112
  • DMT-26
  • DMT-90
  • DMT-91
 Virtual meeting 2017-04-19

ATTENDEES: Wendy, Dan G., Jay

  1. Dan made progress on variable cascade; report by next week, hopefully
  2. DMT-110 resolved and entered
  3. Collection model

Collection model discussion

Discussed relation to Node/NodeSet
Collapsing Collection with Ordering/Unordering
Jay will talk with Dan G. on Friday to bring him in on this conversation
We will want to pull these ideas together and see if we can get some feedback/comment from Flavio

 Virtual meeting 2017-04-12

NOTE: there was no meeting 2017-04-05 due to the NADDI Conference

ATTENDEES: Wendy, Jay, Larry, Oliver
Node/NodeSet

  • What is the status of Arofan? As he was involved in the original solution of Node and GSIM.
  • Larry was not involved in the conversations and so it seems to just add a layer of complexity.
  • Meant to solve 2 different problems at the same time
  • The way we put together the model in terms of an abstract on top of an abstract is something we should revisit. We need to think of this more broadly in terms of the patterns and how they are linked. We need to be clear what the issues are.
  • Look for past notes from Toronto or Vancouver
  • Get this well set up for Sprint.
  • We never refactored once we had made the Collection Pattern, and ended up complicating the Node/NodeSet with the pattern. We could start over and refactor how these work together. If we do this there would obviously be a lot of changes to the representation package. It would be justifiable if this simplified the structure and the use of packages together.
  • In looking at clickable GSIM and its Node and it does. Classification Item, Code Item, and Category Item. Look at how this functions; with our collection pattern we do this differently. The biggest difference is the levels. How does this interrelate to ordering in Collections? Some may want to walk away from "Levels" by using 2 order relations, one within and one between collections.
  • Methodology and Process pattern were already having attempts to use it which is an advantage.
  • This is a logical model that produces implementation.
  • Strict sequence is a good example of simplicity. Can we simplify where we simplify. We would have to look at the inheritance pattern. For example could inherit directly from Binary Relation as you would then not end up with a set of pairs.
  • When you look at StrictOrderRelation it is abstract. So a list can be created from the pairs and vice versa so we don't have to worry so much about the conceptual inheritance of it.
  • It's beautiful in definition in terms of its transitivity, reflexivity, symmetry
  • If the only reason for realizing an abstract relation is to restrict what it is ordering, is this a valid reason for creating a large number of new realized classes.
  • There are instances when you want to use pairs and instances when you don't. For example, partial orders or use of Allen's Rules require pairs. We want to be able to do simple things simply and complex things clearly.
  • Being able to do temporal and spatial orderings is important and so we need the complexity to do these things clearly and the pairs is a good mechanism for that.
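
The remark that a list can be created from the pairs and vice versa can be sketched as below. This is an illustration under assumptions, not the DDI StrictOrderRelation: the function names are made up for the example, and the pairs are taken to be adjacent (predecessor, successor) pairs. A partial order, where some members are not comparable, cannot be flattened this way, which is one reason the pair-based form is still needed.

```python
def list_to_pairs(items):
    """Adjacent (predecessor, successor) pairs for a strict sequence."""
    return list(zip(items, items[1:]))


def pairs_to_list(pairs):
    """Rebuild the sequence from adjacent pairs.

    Raises ValueError when the pairs do not form one unbranched chain,
    for example when they describe a partial order or contain a cycle.
    """
    if not pairs:
        return []
    successor = dict(pairs)
    targets = set(successor.values())
    starts = [source for source in successor if source not in targets]
    if len(starts) != 1:
        raise ValueError("pairs do not form a single chain")
    ordered = [starts[0]]
    # The length guard stops the walk if the pairs contain a cycle.
    while ordered[-1] in successor and len(ordered) <= len(pairs):
        ordered.append(successor[ordered[-1]])
    if len(ordered) != len(pairs) + 1 or len(set(ordered)) != len(ordered):
        raise ValueError("pairs do not form a single chain")
    return ordered


# Hypothetical questionnaire flow as a strict sequence.
flow = ["q1", "q2", "q3"]
flow_pairs = list_to_pairs(flow)
```

Calling pairs_to_list(flow_pairs) gives back the original list, while a branching input such as [("a", "b"), ("a", "c")] raises ValueError because "b" and "c" are not ordered relative to each other.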

Questioned whether for a codebook which is retrospective (or at least descriptively focused) there is a need for multiple moods.

Oliver won't be available next week.

Primitive data types - send Oliver issue number

 Virtual meeting 2017-03-29

ATTENDEES: Wendy, Larry, Dan, Jay, Oliver, Jon
DMT-122, 121 - resolved
leave class ExternalMaterial identifiable
move from Identifiable to AnnotatedIdentifiable
DMT-104 new doc, remaining issues; resolved (action issues in progress)
DMT-102, 96, 101, 114 prepare for week after NADDI
DMT-100, 97, 115 - in progress

NO MEETING NEXT WEEK - NADDI

 Virtual meeting 2017-03-22

ATTENDEES: Wendy, Oliver, Larry, Jay
DMT-122 - set up for next week
DMT-119 - update
DMT-121 - set up for next week
DMT-16 - assigned to Oliver to review
DMT-72 - set up for next week
DMT-117 - set up for next week
DMT-67 - discussed issues and approach

 Virtual meeting 2017-03-15

ATTENDEES: Wendy, Jon, Dan G., Oliver

Update on Lion Server from Oliver:

  • Could be the possibility of getting Lion on the cloud very quickly. There is already a means of moving it up there.
  • Will try to get John Shepherdson for a time and try to get this set up in the cloud. If it doesn't work, then go to GESIS IT department for a short term without the political issues.
  • The cost shouldn't be over about $20/month (traffic would vary but should not really be high)

10 remaining D4Q2 issues

  • These can all be completed before or during the IASSIST Sprint
  • Any issues unresolved prior to the sprint should be well-structured and prepared in order to complete during sprint

DDI and GSIM

The following discussion regards the relationship between DDI and GSIM, what needs to be done to clarify the relationship, and a possible work plan to address this. Related issues from DMT:

  • DMT-119 Relationship of Process Pattern in DDI with GSIM
  • DMT-66 Population time space and Unit
  • DMT-67 Implementation of Design Patterns
  • DMT-114 Review of Representation and its use in Statistical Classification View
  • DMT-18 Adoption of GSIM terminology

In terms of GSIM we seem to tie ourselves in knots when our class diagrams do not look the same between DDI and GSIM. The relationship between these needs to be in terms of functionality, not replicating the model. We need a more flexible way of conforming to GSIM. The GSIM model provides the requirements but DDI needs to model the functionality. We need to look at this, write it down, and describe the differences. This requires documentation at a variety of levels. Given the functional requirements of GSIM, if you have an interface that implements them, will DDI provide the same functions? That is the criterion. It would be difficult to go through both models and check that. Could we devote a week to this form of comparison with about 4 people? Does DDI have the functionality to support this?

How to approach this.
1 - idea of taking a group at Dagstuhl to do this would be great, but it would have to be a dedicated group without distractions
2 - splitting into parts makes sense because GSIM has discrete parts and scheduling a day of dedicated work with meetings. We are looking at GSIM and finding stuff in DDI. We could then claim that DDI is an ISO 10000 (technical report) implementation of GSIM (one standard or spec of another if it conforms to that standard if it satisfies the requirements). If we can actually prove we satisfy that requirement it is all we have to say
Spread it out with 3 hour meeting times over the course of several months. Anyone with knowledge of both systems. Need people who know GSIM and people who know DDI. People who worked on various aspects of GSIM: Dan G. conceptual group, J Gager, Arofan, Flavio, Rob McKlelland(?) stats can, Alistair Hamilton, Franck Cotton, Guillaume

ACTION: Dan will check on others who were engaged in GSIM in each of those groups

Can we do the same things in DDI as we can do in GSIM?
The summer sounds like a good idea but we need to lay out the work plan for this carefully. Do we want to involve externals in this or not. Part of this is where HLG stands. If we use the 2 phase approach getting the GSIM people in early we could possibly use Dagstuhl to wrap it. We need to be able to say that we conform. We cannot do this without this type of detailed look. The fact there will be tools made for this specification means that doing this would be an incredible step up. CSPA logical model
HLG is only looking from the point of view of the national statistical offices but their own work is being done by people outside of their own limited community. The next HLG meeting is in November. If we could prove conformity over the summer we could contact HLG.
We have the capability and Dan can go to the HLG meeting. This work would put us in a very good position.

ACTION: Wendy will draft up a work plan, identifying any gaps in DDI we need to get done ahead of this work. We're at the point of proving things work the way we intend them to and making sure we are efficient. We need a work plan that is clear, well-structured, and achievable. Needs to be outcome driven. GSIM: 4 main groups and an underlying infrastructure. We have the same thing in DDI. It should be doable. They could work concurrently and not have to interact much.

 Virtual meeting 2017-03-08

ATTENDEES: Wendy, Larry, Jay
Inheritance issues from codebook:

  • Inherited relations that point to different classes should clearly identify that they overwrite, or should use a different relationship name
  • Instance question does not inherit from represented question forcing you to have both - file with DCAT

Discussion of DMT-67

What Jay was meaning to do was to first look at a larger problem and solution to look at some of the more specific problems. Looking at an approach.

Need to look at the changes that Flavio made and that some of the work Arofan did may be less applicable. Based on the minutes of the last meeting several issues were raised about some realization packages in the context of Codebook and other use cases. By creating the MethodologyOverview we also created a kind of Pandora's Box. The larger question of how people are going to use this model has been raised. Methodology is rather critical to everything and so how do you get enough "specifics" without having a general purpose realization.

There are people who are interested in higher level descriptions and people who are interested in more specific description or machine actionable content. How to use things for just an "overview". How to simplify (reduce content). As you design something you often start from very general and then flesh it out. Similar to extending from Codebook to Lifecycle. They want to do a Lifecycle thing with specific nodes.

Concerned about us ever being able to cover all types of methodologies. This is why we created the MethodologyOverview. Dan's stitch that could point to any detail.

Maybe it shouldn't be a Methodology realization but a simple overview and how you relate to any extension. You may have to create abstract extension heads.

May need to explore the production issue so that we could extend from pattern abstracts. Or create some non-pattern abstract or non-abstract extension bases.

If these things are connectors we should think about whether they need to realize a pattern and just a level of generics layered in like a summary description level that could be used at the design phase of a study. We would also look at whether this would also act as a retrospective summary. What is the impact on Bindings that allowed you to look at a process in real time, retrospectively. [Planning to do (i.e. this is the process that is to be applied); applied use of process (current inputs and outputs); runtime variations; retrospective summary]. Similar to CDISC model (planning, scheduling, ...) Begin with something that is conceptual and then extend it.

Jay could lay out the four pillars from the CDISC BRIDG model and then apply that to our methodology/process model and then see if we could take the same approach.
One of the things you have to decide is where Process belongs. Methodology or Process Pattern? Should these be put together to form a continuous pattern.
Address the issue of being able to use the same content in different views (slices of the model library)
As work gets done post it on DMT-67.

 Virtual meeting 2017-03-01

ATTENDEES: Wendy, Jon, Jay, Dan G., Oliver, Larry
DMT-105

  • Document needs final comment (will request via list)
  • Discussed locations where relationship targets are pattern classes as this is not allowed in rules
  • Raised the question of how to address the ability to link basic or descriptive or overview realizations to more specific pattern based classes as a specific extension of the metadata (adding later additional detail within a view or to detailed metadata external to the view).

Both specific issues and general approaches were discussed. The following are comments from the discussion in the order they appeared. They have been labeled to facilitate the work of DMT-120 in preparation for the IASSIST Sprint

  • Specific issues:
  • Do we need a high level ProcessOverview similar to that done for Methodology?

General Issue Description:

  • We have a lot of the mechanics in place regarding patterns
  • We've said that 105 is related to 67 where we want to describe how to use patterns to do more specific things
  • This document should come before the actual gap analysis and work
  • Jay is hoping to get to 67 this weekend - can go on agenda for next week

Possible approach:

  • Use of some of the overview types as extension bases?
  • If we do that we could be giving people too many options of doing it the wrong way
  • We want them to describe them in the same way therefore specific realizations of patterns
  • Issues of semantic interoperability

Use cases:

  • Scenario where we want to take basic metadata and extend it

Possible approach:

  • We've thought about a controlled vocabulary for the high level descriptions of methodology - to identify which kind of description this is like a "subject" of methodology
  • One way would be when we have a specific model that expands on a specific methodology - add a pointer to a specific instance of a detailed methodology

Specific issue:

  • What is the role of external material? Can we separate the pointer from external to internal. Should external be a pointer to non-DDI and therefore be renamed?
  • We would rather have more precision in terms of the target of the pointer. Relative to DDI-Views model content, other versions of DDI should be considered external, with examples of how to specify that the information is in a specific format other than that of the instance (i.e. DDI-Views)

Future work:

  • Prepare some use cases of these situations for Codebook: taking that basic information and extending it into a different view which provides depth or other specifics, can we provide a reference from a more general description to a more detailed one? Also lay out a number of approaches to test the use cases against. Do this for the IASSIST Sprint.
  • We have some ideas but need to fill them out.

General Issue Description:

  • With the high level descriptions we created a kind of island and can't put things together in a continuous way. We need to revisit how we did this and find an approach that doesn't create an island.

Ideas for unification:

  • Adding specific typed pointers - how to avoid malicious compliance.
  • Use of non-pattern extension bases - yet avoid chaos
  • A typing way of providing the specific link (i.e. if type="Sampling" target must be "SamplingPattern")
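A minimal sketch of the typed-link idea above, assuming a simple lookup table. The names ALLOWED_TARGETS and check_typed_pointer are hypothetical; only the "Sampling" / "SamplingPattern" pair comes from the discussion.

```python
# Hypothetical mapping from a pointer's declared type to the pattern class
# its target must realize. Only the "Sampling" entry comes from the minutes.
ALLOWED_TARGETS = {
    "Sampling": "SamplingPattern",
}


def check_typed_pointer(pointer_type: str, target_pattern: str) -> bool:
    """Return True when the target matches the pattern required by the type."""
    required = ALLOWED_TARGETS.get(pointer_type)
    if required is None:
        # Unknown types are rejected rather than silently accepted.
        return False
    return target_pattern == required


ok = check_typed_pointer("Sampling", "SamplingPattern")
wrong_target = check_typed_pointer("Sampling", "ProcessPattern")
```

Rejecting unknown types is one possible choice here; a looser rule would allow untyped pointers but gives up the check.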


TO DO: File an issue as prep for IASSIST Sprint to get issue described, ideas for unification, and use cases so we can validate approaches against use cases, determine approach and fine tune at sprint. - DMT-120
TO DO: Create JIRA issue regarding documentation and possible renaming of ExternalMaterial DMT-121
TO DO: Set up another Tools Group meeting (Wendy will talk to Oliver about agenda items, what's needed)

FUTURE AGENDAS:

  • DAN will try to finish his items for 2 weeks. (Probably unable to attend next week)
  • WENDY will get some things done for next week.
  • JAY will try to have DMT-67 prepared for next week.

Agenda item for IASSIST Sprint for MT:

  • Qualitative - general review of modeling structure to update to current modeling decisions, this is a modeling review not as much a business content review
  • Unification from pattern to general realizations to specific realizations
 Virtual meeting 2017-02-22

ATTENDEES: Wendy, Jay, Oliver, Larry, Dan G., Jon
Decision: when the active work of the MT on an issue is complete and the task has moved to another group, the MT can close the issue.

  • Updated 108 and 2 to closed

Reviewed document for DMT-105, this will be revised for terminology. A number of consistency review issues were noted.
Task for members:

  • Select an issue from the In Progress list and prepare background information and recommendation to advance resolution of issue.
  • Dan already has several he is working on.
  • Jay assigned himself DMT-67
 Virtual meeting 2017-02-15

ATTENDEES: Wendy, Larry, Dan G., Oliver, Jay

Server issues:
VPN server for Lion - Oliver will be looking into this
No builds for 2 months due to switch - Oliver will pursue
Currently don't have a running system in Google Cloud

Reviewed issues currently being worked on (listed below and updated where we are)
114 - approach determined, need to implement
115 - assigned to DDI4DATA
97 - Dan G and Larry - justification of the variable cascade noted in Data Description group
104 - wlt draft
100 - Dan G. - justification of the variable cascade noted in Data Description group; add 3.3 documentation on Value Domain/response domain/representation (wlt) - need to work with Barry and DCAP group
116 - wlt draft
107 - resolved, entered
117 - wlt draft

Focused Reviews:
How do we test out model revisions between formal reviews? We currently have two issues, DMT-102 and DMT-96, where this would be helpful. Goal is to be more iterative with small, internal, focused reviews.
Discussion:
Might be a good task for someone like Alexander Mühlbauer who offered some time; could also make this offer more broadly to the community to pull together a small internal review community. Create focused use cases and also a means of getting use cases created. Could we create groups with areas of interest? Will raise with AG as an overall development approach.

How to handle issues ready for entry review:
Assign review to a person - randomly, ask for response in a week, make comments on the issue
Jay - DMT-107 is use of xs:anyURI consistent?
Oliver - DMT-108 Cardinality of 2..2 does not appear in XMI
Jon -

  • DMT-111 StructuredString has Content which has content - confusing
  • DMT-76 Class String should be renamed
  • DMT-79 Harmonise models of InternationalString and StructuredString


NEXT WEEK:
DMT-117
DMT-116
DMT-104
Selecting future issues to address (spreadsheet link)
We are addressing those identified for Codebook release. The attached spreadsheet lists DMT issues sorted by status. Those In Progress which are highlighted in yellow are those we are currently actively working on. We need to identify which others currently In Progress (we've commented on all of these) should be addressed next. Criteria? Low hanging fruit, urgency, ability to address in a short time period, what makes sense to you. Send your top 2-5 to the list or bring to meeting.

 Virtual meeting 2017-02-08

ATTENDEES: Wendy, Oliver, Jon, Dan G., Jay, Larry
DMT-104 Clarification of usage of annotation
Annotation is not clear about its usage in relation to the contents of the object it is annotating. What is the intended use of Annotation content in terms of clashes in date, name, version and other duplicative information? The documentation on the purpose and use of Annotation needs to be clarified.
Need rules for usage. Send to MT for action.
The solution should NOT be to put it everywhere (duplication of content). Documentation should be done in depth for particular properties which duplicate information also found in the object (name, version date, etc.).
DMT-117 Is there a conflict in collection relationships with the capabilities of annotation
Annotation supports a number of relationships between objects (those supported by Dublin Core) such as hasPart, isPartOf, etc. Does this conflict with the collection pattern and Binary/NaryRelation? This came up in a discussion of D4Q2-65 (closed) and TC thinks it should be discussed and documented by the MT.

Discussion

  • The Fundamental question is what do we need to annotate
  • Sometimes a metadata document will have some of the elements we include in an annotation.
  • Who, when, why:
    • for the creation of the metadata documents and
    • For the creation of the metadata instance of a class and
    • For documenting instances of data, studies, or view
  • Prevent infinitely regressive annotation (documenting something at multiple levels of recursion)
  • If we have that capability in the model can we prevent "this" from happening
  • We have created an infinite loop in the model for annotation
  • What we have right now are classes that are annotated identifiable - that annotation is about the instance of that class
  • There are also a number of other classes that use Annotation other than as an inheritance from Annotated Identifiable
  • Need the ability to annotate an instance of a document as an edition of the metadata separate from the creator of the intellectual content (like the difference between section 1 and section 2 of Codebook)
  • The need to annotate a variety of things
    • document creation of a conceptual instrument
    • Someone else could use that conceptual instrument and you want to document the packaging of the conceptual instrument - you are shipping it out in another binding (XML, RDF, or a packaging of information)
    • Permanent and transient instances - publishing for distribution
  • On one hand anything can be annotated that contains a set of metadata and would carry an annotation
  • In the creation of a DDI document the root node could contain document information, so we could bring an annotation to every container element
  • Annotation has shifted from an Identifiable class to a complex data type, which solves one of the problems initially seen in creating DocumentInformation - that of having a stack of unassociated Annotations with links to various places
  • The context of the annotation is now clearer because it's a complex data type
  • We need the documentation to be clearer about the instance of the class or the metadata document and give examples in the description of annotation
  • This would be a good way to refer to Dublin Core in general, correspondence with DC. Example could include reuse of something
  • Can this be used to support a provenance chain of source?
  • There is a whole document on this from Dagstuhl work and this should be referenced from the document. Some should be added to our documentation in parts. This is published in IQ
  • Review this for any specific content for High level documentation.
  • The "official title" of an annotation corresponds more to the Dublin Core elements and the purpose of resource location; it is a means of assigning credit, so a title alone is meaningless
  • Name is/can be a machine actionable context specific means of denotation
  • Would an instance variable have a Name ("linguistic signifier") while Title holds the "full authoritative title"?


DMT-117

  • It's a parallel issue. The Collection relationships are machine actionable regarding relationships between items in a DDI collection, whereas the context of the relation information in annotation is far reaching. The relationships in Catalog are much more structural and imply certain actionability
  • Different types of annotation are appropriate in different contexts. Annotation properties should be thought of within the full relational context of annotation. Individual properties may have similar classes in DDI but are used independently in other contexts.
  • Annotation should not be used to express similar uses of a specific class that parallels a property in Annotation.

NEXT WEEK:

ACTION: Wendy will summarize above discussion and provide text for various uses. These will be reviewed next week for agreement and specific actions
Prepare FROM 2017-02-01
DMT-116 - discuss to clarify intent and realization - wlt draft up and review where changes would be needed (related issues)

 Virtual meeting 2017-02-01

ATTENDEES: Wendy, Jon, Larry, Oliver, Dan G.
What is specific to codebook (what needs to be done in order to complete):
DMT-97 [can someone volunteer to frame - suggested definition/description, points where clarification is needed] - Dan and Larry will take a shot at this and run text past Jon for clarity review
DMT-100 [may just need documentation] - linked to DMT-115
DMT-115 - it's a bit of a Swiss army knife; meant to be more compatible with ISO 11404; - Assign task to Data Description to describe 3.2 range of variable representations in DDI Views as documented example
DMT-116 - discuss to clarify intent and realization - wlt draft up and review where changes would be needed (related issues)
DMT-107 - resolved - entered
METH issue resolved, enter, and review
Next Week DMT-104

 Virtual meeting 2017-01-25

ATTENDEES: Larry, Dan G., Jay

Reviewed the proposed Descriptive Methodology. Recommended name change. Added Issue to MWG

 Virtual meeting 2017-01-18

ATTENDEES: Wendy, Larry, Jay
Discussion of Codebook issues raised in their last meeting:

  • You have to have something that realizes a class in a pattern in order to make it usable
  • There is a generic realization that can be used for multiple purposes but may be something that is lightweight
  • Does this mean that we need a realized descriptive methodology
  • Dan had some use cases where there would be no algorithm, design, etc.
  • We are still experimenting where the balance is between Pattern (totally generic), Specific Realization (specific), and General Realization (Workflows, multiuse)
  • Balance with expression in Functional View
  • So we need to get this out and see what works
  • With the use of classes in Functional Views you can cut off the inclusion of, say, the Process that this points to.

ACTION:

  • Wendy will draft up a generic descriptive methodology realization and send it along.
  • Send out annotated list of issues possibly related to Codebook to MT and also post on CWG for comment and ranking

Background information from emails below:

Where does Embargo go?  (one proposal: in Annotation)
               Or Embargo relationship from InstanceVariable
Where should security information go? (one proposal: in Annotation)
               Access from InstanceVariable

Make a Methodology realization   to use overview for:
               TimeMethodology  (0..1)
               SamplingMethodology  (0..1)
               DataCollectMethodology (0..1)
… (more)
               OtherMethodology  (0..n)
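
A minimal sketch of how the cardinalities listed above might be expressed, assuming Python dataclasses. The class names MethodologyOverview and MethodologyRealization and the field `description` are hypothetical; the slots elided in the list above ("… (more)") are left out rather than guessed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class MethodologyOverview:
    """Hypothetical high-level description; 'description' is an assumed field."""
    description: str


@dataclass
class MethodologyRealization:
    # 0..1 slots: at most one overview of each kind.
    time_methodology: Optional[MethodologyOverview] = None
    sampling_methodology: Optional[MethodologyOverview] = None
    data_collect_methodology: Optional[MethodologyOverview] = None
    # 0..n slot: any number of other overviews.
    other_methodology: List[MethodologyOverview] = field(default_factory=list)


# A codebook that only describes sampling leaves every other slot empty.
simple = MethodologyRealization(
    sampling_methodology=MethodologyOverview("Simple random sample"),
)
```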

InstanceVariable (ConceptualVariable?) points to            
               derivationMethodology


---- 
Need Methodology at the high level - do we need all of the classes it relates to as well?
For derivation we need to add Design and Algorithm

General question for modeling- how do you add in just part of a pattern?
Methodology realizations sound interesting.

Is there a StudyMethodology? When I look at http://lion.ddialliance.org/ddiobjects/study , it “hasDesign” and “hasProcess” currently. If we had a StudyMethodology, couldn’t Study “hasStudyMethodology”. In StudyMethodology we could subsume Wendy’s “Group” and achieve backward compatibility.

I am a little wary about using Annotation as a home for security information. Potentially, security is as rich as Sampling. One way or another it may need its own object(s).
----
One of the possible problems is what happens when you invoke the methodology pattern. If you include the methodology class, then you have to take everything else in the pattern as well. Isn’t this right? That is a lot of overhead for codebook when all you want is a simple high level description. This is one of the big issues, though we may not understand how to use patterns.
----
I’m unclear if the targets of hasDesign (Design) and hasProcess (Process) are correct in that they are pattern classes.
----
Patterns can point to other pattern classes. When you realize the pattern you would need to create a realization of the class that it points to if you plan on using that. For example a Design has a relationship to a Process but if you don't need a Process you can leave it as a relationship to "any realized Process" by the end user. In most cases you would want to create a realization that allows you to either do a basic description (by not including a class in a Functional View) or provide full detail.

The alternative would be, in the case of Methodology Pattern, to drop all relationships to Process and provide documentation on where someone who wants to include Process should add a relationship in the realization.

We have gone back and forth on this which is probably a source of much of the confusion. Initially we wanted to have simple classes that extended to allow for complex content. The problem was that in an instantiation you couldn't just "add on" additional detail, you had to create a more detailed extension of the simple class. Therefore we decided to model the detail and allow you to restrict the amount of detail in the Functional View (so for example a SamplingDesign that did not support the use of a SamplingProcess within the Functional View).
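
A minimal sketch of the restriction described above, assuming a Functional View can be treated as a set of included class names. The MODEL table and the function relationships_in_view are hypothetical; the SamplingDesign and SamplingProcess names come from the example in the text.

```python
# Hypothetical relationships in the detailed model: source class -> targets.
MODEL = {
    "SamplingDesign": ["SamplingProcess"],
    "SamplingProcess": [],
}


def relationships_in_view(view_classes):
    """Keep only relationships whose source and target are both in the view."""
    return {
        source: [t for t in targets if t in view_classes]
        for source, targets in MODEL.items()
        if source in view_classes
    }


# Basic description: the view leaves out SamplingProcess.
basic = relationships_in_view({"SamplingDesign"})
# Full detail: the view includes both classes.
full = relationships_in_view({"SamplingDesign", "SamplingProcess"})
```

The detail is modeled once, and each view decides how much of it is usable.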

We will need to structure this discussion tomorrow to clarify where we need clear instructions and where we have gaps.
----
Dan, based on Wendy’s explanation -- which I think is in line with examples Flavio constructed with a questionnaire friendly Workflows built using elements of the ProcessPattern, I think we can, just as you have advocated, pick and choose with Methodology. We can approach Design, Algorithm and Process, including some classes and not others as needed.

However, until I read Wendy’s account, I hadn’t realized how we had, using patterns, solved this problem that had dogged us when we specialized/extended. Very interesting.
----
So, for the most part, Codebook should adopt the idea of just incorporating Methodology as a class to describe each: sampling, editing, collection, etc. It gets somewhat complicated when we think about derived variables. There, you need to know more, and yet the methodology pattern is exactly what is needed. Can we adopt different percentages of the pattern in different parts of the Codebook? That seems to be what is needed.
----
Methodology as a class cannot be added to a Functional View. You need to have a realization of that class that can be specific to a business activity/topic or more general covering an identifiable range of things (for example Workflow can be used for an Instrument or for data processing etc.).
The realizations of the pattern should be created to fit the need. Note that if say you created a methodology for derivations it could at the point of the Process pick up and use the existing WorkflowProcess which would make all of that structure available for use. This is just an example, don't know if it makes sense to use this in this case.
----

 Virtual meeting 2017-01-11

ATTENDEES: Wendy, Jon, Oliver, Jay, Larry, Flavio
Short discussion between Jon and Oliver regarding .dot rendering issue from D4Q2
Category Set and Category Item need to see where this one is...connect to DMT-114
DMT-114 - see comments in JIRA
DMT-97 Compatibility or capability - get background on this


What is specific to codebook (what needs to be done in order to complete):
DMT-114
DMT-115
DMT-104
DMT-97
DMT-100 [may just need documentation]


  • There is a kind of pattern across this issue where the concern is that, because of the way we introduce the variable cascade, they don't see how something is supported or how things work together and reflect the capabilities of DDI-Life/Codebook. We were more concrete in earlier versions and we need to assure people that the things they are not finding are there. If they really aren't there we need to address that, but first we need to be sure we relay how the conceptual approach translates into implementation of something they are familiar with. Goes back to discussion of platform specific model: we need to keep it as simple as possible and as close to what is expected as possible, but we still maintain the abstraction for reuse and conceptual understanding.
  • For example the notion of a root in XML and how that is implemented or represented in the model. We need to find the sweet spot.
  • Find a good example where we think this is happening so we can find a way of mapping it in a consistent manner. We need to find something like the pattern where we can say X items using a similar transformation.
  • Why don't we use the collection pattern...maybe Data Capture and Agent.
  • When in Norway tried to do that by recreating an earlier version of a codebook and the only way we succeeded was by cheating.

ACTION:

  • Write up charge for group working on DMT-114 review and then contact individuals to see if they are willing to work on this
  • Do background work on items identified as needed for Codebook
  • Prioritize DMT issues (first pass)

NOTE:

Flavio will be unavailable until later in the spring

 Virtual meeting 2017-01-04
ATTENDEES: Wendy, Dan S., Flavio, Oliver, Jon, Larry, Guillaume, Barry, Jay
AGENDA: DMT-102, DMT-96 (detail notes are on DMT-102)
  • Drop the idea of generic as that is the purpose of the pattern. Realizations need to have specific use.
  • This is an iterative process and we can expect to go back and forth between the patterns and realizations.
  • Make sure the workflow is usable by data capture
  • May reuse parts of these realizations in other realizations (such as binding, control construct subtypes)
  • Wendy will do Process, Flavio will do rest.
 Virtual meeting 2016-12-21
ATTENDEES: Wendy, Flavio, Jay, Larry
List issues we need to talk about with Flavio
CategoryItem and CodeItem solution - look for this (Dan G. conversation) - related to signification pattern
Linked DMT-102 and DMT-96 comment from Jay regarding Workflow
See DMT-96 for discussion
Relook at level of relations between patterns (extension of classes between patterns)
Review need of identifier in realization of pattern where an object is not identifiable (leave as associations until review is done)
Patterns vs. Realizations and the use of generic realizations
* At the Workflow level - any issues where there is daylight between what is in the realized and what is needed by Instrument we need to remove the daylight - for example make an InstrumentFlow rather than a generic Workflow
* Really militates against having "generic" realizations and in favor of having specific-use realizations (collection is an example of this because the realizations are use specific)
How to deal with the second workflow type by starting an Instrument Flow package and allowing Workflow to remain more generic
ACTION: Flavio will get the binding piece into Drupal so that we can update the FV so it can be tested.
 Virtual meeting 2016-11-23
ATTENDEES: Wendy, Dan S., Oliver, Larry, Guillaume, Flavio
DMT-108 entered in Drupal GitHub as Issue 61
DMT-80
  • 20931 lines for Codebook -- 66% are Annotation, many Controlled Vocabularies of 9 lines each
  • Need to look at this before the Codebook View goes out
  • Oliver and Larry will add some specifics as comments to this issue
DMT-96 - notes also added to issue
Clarify what this is trying to accomplish
Dan S. -
  • A composition of a bunch of reusable items and connect them in some way that can be specific to a usage
  • Parameters defining the inputs and outputs of a workflow step
  • Bindings connect inputs/output to/from inputs/outputs
  • We need to be able to use sequences and subsequences including all of their bindings
  • Problems: hasParameters (Parameters) 0..1
  • The idea of sharing an interface is not correct - it needs to be bound to different things and is therefore not reused
  • Could just have input and output parameters associated with a workflow set
  • Fails to allow unique parameters that are unique to a specific workflow set
  • The idea of reusing the same in parameter makes sense in terms of binding to multiple locations but the inputParameter is an ID associated to a single object
  • You could have a more generic input
  • The notion of an inputParameter and outputParameter could be restricted in the workflow rather than the pattern
  • InputParameter is doing too many things here
  • Workflow was a generic implementation of the ProcessPattern
  • At the ProcessPattern reuse might make sense but not at the concrete implementations
  • Perhaps the Interfaces need to be separated from the input output parameters
  • Making it the use of a defined parameter
  • Input Output at the process pattern it is conceptual and then more realized
  • Input Output should be tightly bound to their containing workflow step to fulfill their role in binding
  • The notion of reusing interfaces is valid in the process model but may needed -
  • Parameters might be adding an unnecessary level
  • What is missing?
  • Should Workflow be constrained to support Questionnaire needs or should it be left as a generic and create a separate one with more restrictions?
  • Input and Output Parameters are specific things that are identifiable. How does it change from being an input parameter to an output parameter.
  • At the modeling level there might be an indirection; InputParameter is a typed container (a space in memory you reserve for specified content which change at run-time)
  • How is it possible to have many different value domains?
  • If you have a workflow step that has a specific inputparameter it is an identified container.
  • Don't understand what its containing.
  • At design time you are connecting the pipes and at the run time it is the specific content that runs through it
Binding
  • InputParameter and OutputParameter both contain the same information so why do we differentiate and need to be modified to work with the binding
  • Bindings say they have input of inputParameter and output of outputParameter - should be source to target because it's not always an input to output
  • If source and target then target object should be parameter
  • What does a binding mean that has a parent as well - not reused - should not be identified and should just be pair-wise relationships between bound parameters
  • You don't have many to many in all situations, but a 1-to-many is reasonable. If we want only pair and then multiple bindings that is a decision. The one in the pattern is the most generic.
  • What is the purpose of having the bindings be identified. If bindings are only used within a specific scope then there is no purpose for identification.
  • We need to revise in the context of how much we want to/can reuse.
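
A minimal sketch of the binding discussion above, assuming parameters are typed containers owned by a single workflow step and bindings are unidentified, pair-wise source-to-target links. All class, function, and step names here are hypothetical and not taken from the model.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Parameter:
    """A container owned by exactly one workflow step."""
    step: str
    name: str


@dataclass(frozen=True)
class Binding:
    """A pair-wise link from a source parameter to a target parameter."""
    source: Parameter
    target: Parameter


def propagate(bindings: List[Binding], values: Dict[Parameter, object]) -> Dict[Parameter, object]:
    """Copy each source value to its target, in the order the bindings are listed."""
    result = dict(values)
    for b in bindings:
        if b.source in result:
            result[b.target] = result[b.source]
    return result


# Design time: connect the pipes. One source feeds two targets via two pairs.
age_out = Parameter("AskAge", "answer")
check_in = Parameter("CheckAge", "value")
report_in = Parameter("Report", "value")
bindings = [Binding(age_out, check_in), Binding(age_out, report_in)]

# Run time: specific content flows through the pipes.
flowed = propagate(bindings, {age_out: 42})
```

Here the 1-to-many case is expressed as several pair-wise bindings, which is one of the options raised in the discussion.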
General question in point. Was thinking that Workflows implementation should be the basis of all of the processes we have. Everyone will have to create their own conditional constructs. Workflows is the base of what can be extended for different usages. There are a lot of things in there that deal with more than just the workflow aspects. Services, precondition, design, algorithm used, etc. Focus on Information Flow, control features, etc. i.e. a question doesn't have a process framework. Or instructional command, command code, etc.
Act in Methodology might be "we did a pretest" which doesn't need an instructional command.
Very concerned that process is becoming so generic
ACTION:
Flavio will rework the bindings and parameters based on this discussion
Wendy will review workflow for making it leaner and more focused
 Virtual meeting 2016-11-16
ATTENDEES: Wendy, Larry, Oliver, Dan G., Flavio, Jon
Larry:
Manual template for an instance variable in YAML, then used his Python program to produce a YAML template for each view
One of the issues is how to represent repeating elements in YAML
Larry will share when it's complete
That becomes a definition of a JSON schema so it shouldn't be hard to produce a JSON output/binding
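A minimal sketch of the repeating-element question, assuming the template is held as nested Python data and using only the standard json module. The field names (name, label, language, content) are hypothetical and are not taken from Larry's template.

```python
import json

# Hypothetical template for one instance variable. A repeating element is
# held as a list, which YAML writes as a "- item" sequence and JSON as an array.
template = {
    "InstanceVariable": {
        "name": "",
        "label": [  # repeating element: zero or more entries
            {"language": "", "content": ""},
        ],
    }
}

# The same structure serializes directly to JSON and reads back unchanged.
as_json = json.dumps(template, indent=2)
round_trip = json.loads(as_json)
```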
DMT-100 Dan will talk to Barry et al to get more information
DMT-102, 96 Flavio will try to have ready for next week; contact Dan Smith to join if this is on agenda
DMT-97
  • One path to a solution is the use cases we've been working on. If we can show how things are divided up it will become clear what an "instance variable" is.
  • Will need some explanatory text in the front to clarify.
  • We can talk about split between physical and logical manifestation of a variable based on the use cases.
  • The ValueMap adds another layer below the instance variable that is a lot more physical.
  • Goal is to get a short document to clarify the roles of Variables: Conceptual, Represented, Logical, Physical using the use cases
DMT-101 Yes, it's wrong - Flavio has been assigned
DMT-99 Larry is working on
DMT-98 done
Added DMT-109 regarding clear definition of Functional Views
DMT-108 Oliver will put on Drupal issue list
DMT-107 Wendy will provide background
DMT-105 Wendy will try to have for next week
NEXT WEEK AGENDA: DMT-102, 96
 Virtual meeting 2016-11-09
ATTENDEES: Wendy, Jon, Larry, Oliver, Dan G., Flavio
DOC: Short discussion of templates for Use Cases - problems in terms of size and usability
Notepad++ - YAML editor https://notepad-plus-plus.org/
JSON is also a possibility but it's easier to add comments in YAML
DMT-103 resolved: primary issue referred to DOC-12; added issue DMT-105 review of abstract class usage and content
DMT-95 closed
DMT-102 went over with Flavio, he will create the two possible approaches to revising workflow; Wendy will draft up review of Abstracts and rules for use outside of Patterns
 Virtual meeting 2016-11-02
ATTENDEES: Wendy, Larry, Dan G., Oliver, Flavio
Went through new issues from Dagstuhl. Priority is DMT-102 and DMT-96. See JIRA issues for details.
DMT-102
Get Flavio and Jay to look at workflow issues and come back with a recommendation
DMT-96
Binding set up meeting - Input/Output/Binding
Flavio will begin discussion with Dan Smith to clarify what is needed, what didn't get into Drupal.
Dan Smith should be involved in this discussion. Send out note to small group to get this started.

DMT-103
Pattern issues - Oliver will create a list using XSLT and Wendy will do a first pass for review
DMT-95
Check with Barry about meeting times. Oliver/Wendy
DMT-98
Wendy will do it
Next Week's Agenda
Flavio should have some content for DMT-102 or DMT-96
 Virtual meeting 2016-10-05
ATTENDEES: Wendy, Larry, Jon, Dan G., Oliver
Finalizing package for TC:
  • Jon has 3 paragraphs to put in and few last minute things to do
  • Documentation will be ready within the hour
  • Select include packages in Drupal
  • Push a build into bitbucket
  • Get build number and process on the pipeline
  • We are done tweaking with Drupal
  • Send out note to stop work on Drupal
  • List is on spreadsheet
  • Reviewed list of open issues and modified to clarify. The DD set have been set to Steve as they were updated.
  • Look at documentation for any glaring issues; high level documents and glossary
  • http://ddi-views.readthedocs.io
  • Do things need to be limited in review package such as the RDF folders.
  • Need to list what doesn't match what is in the build for those who go back and pull directly. (this should be noted on release page)
 Virtual meeting 2016-09-28
ATTENDEES: Wendy, Larry, Jon, Dan, Jay
Reviewing what needs to get done to send TC
  • Flavio documentation is looking really good but could still use some additional clarification material
  • Issue DMT-91 Larry will write up issue for inclusion in review guide
  • Dan will see what he can get today on documentation issues and let Wendy know what can and cannot be completed for reassignment if necessary
  • Instrument View - OK
  • ConceptParentChild and ConceptPartWhole need to be noted as needing correction and updating to reflect new relationship patterns for consistency across the model.
  • Align what is going out and the package names on Q2 review page and project management page- Jon reviewing
  • Note on MethodologyPattern and Methodologies are going out without a realization - still under development
  • Need to push to Sphinx - DDIViewsQ2 don't know - we can put in a specific build
 Virtual meeting 2016-09-21
ATTENDEES: Wendy, Larry, Oliver, Flavio, Dan G.
Quick question:
  • Physical Layout contains ValueMapping and the cardinalities seem backwards - check cardinalities
  • It is wrong so Flavio will fix
  • May need to move SegmentByText and parent may need to move to a separate package
  • Some relationship that should be easy but isn't
  • InstanceVariables and associate those with the question text or question. InstanceQuestion and InstanceVariable
  • Codebook is going out after Q2 so make note of it in Data Capture and InstanceQuestion
  • If InstanceQuestion inherits from RepresentedQuestion (like variable cascade) it would solve this issue and also list it as an unresolved issues that should have further discussion
  • Some concern about inheritance as not seeing what exactly is inherited
  • What you gain is not having to have the RepresentedQuestion as well as InstanceQuestion
  • Regarding the direction of the link between InstanceQuestion and InstanceVariable
Methodology in or out?
  • there is still development work being done
  • Implementations would be ready for Codebook and presented in context
  • We don't want to put out garbage
  • We don't have a solid implementation but could put out the pattern in the documentation
  • Did the Methodology group agree that the pattern was OK?
  • We do seem to be happy with the pattern in general -
  • Identified a general framework for describing any Methodology (Design, Plan for implementation, and a way that it is actually implemented). There are inputs and outputs and they are tied together. It can also be used to describe a process in steps of detail, from high level to specific (use of multiple methodologies to describe a complex system from a broad to a granular level). The Methodology is found in documentation as we don't have a realization of it. There will be several in the Codebook release, for example Sampling, data capture, and data cleaning.


ACTION: We need a roadmap - notify AG and make them a Jira task list

Patterns Document:
  • Continue to use the term DDI-Views
  • Includes only those with realizations - ie does not include Methodology
  • Updated based on last week's discussion
  • Covers: Collection, Process, Signification
  • References to other documents:
  • Documents on relations from Dan -- update and send to Flavio
  • Document on signifiers (already published) send 3 together to Jon to determine publication
Reviewed DMT-89
  • Larry will do Qualitative
  • Dan will do Representation
  • Conceptual has greater problem of ConceptParentChild and ConceptPartWhole - need to update to reflect current set of relationships - Dan, Flavio
  • Methodologies - Wendy
  • Remove ConceptParentChild and ConceptParentChildPair and consider renaming PartWhole
(Make it so)
 Virtual meeting 2016-09-14
ATTENDEES: Wendy, Dan G., Flavio, Larry, Jay
Flavio: Process Document
  • Updates have been made, colors aligned
  • Where are we in terms of meeting timeline
Asymmetric Relations:
  • Move blue classes to the right - do we need the general class
  • the correspondences are not different from the asymmetric relations
  • Understanding a connection to the correspondence class would not be obvious
  • We reduce usability if we get rid of the general term - may want to keep the more generic one in the pattern
  • We've realized patterns in various ways
  • Do we need both in the pattern
  • Collection correspondence, is that an asymmetric relation in all its glory or is it a specialized case
  • The way it is right now is as an asymmetric relation. Right now both are in the pattern.
  • This sounds like the right approach
  • Replace the asymmetric / symmetric relation with a collection correspondence relation and rename it. Maps goes away from the pattern.
  • Everything pointing to collection correspondence stays that way and move collection correspondence out of
  • Blue stuff disappears and have exactly what you see in red. Add Maps to correspondence table (Maps realizes structures) which will now be called Maps
  • Collections pattern will then be done
Process Pattern:
  • pattern then realization of workflow etc.
  • We have in workflow two ways to work with order
  • Sequence Order (realization of strict order relation)
  • SequenceOrderPair; steps can also be ordered by temporal relations
  • PrecedesIntervalRelation also a strict order relation so equivalent of SequenceOrder
  • The first has no temporal notion and the second has the temporal dimension explicitly
  • SequenceOrder is redundant and so should be removed.
  • These are workflow steps when have a time dimension
  • Cases outside of this where sequence is not temporal (alphabet, numeric, geographic etc.): you would still realize a StrictOrderRelation
  • Gets people looking at the temporal relationships so that they can use the most appropriate
  • Need to document, for workflow or other time-related models, how and when to use the temporal relationships in complex processes (end user documentation for person entering metadata)
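
The point that SequenceOrder and PrecedesIntervalRelation are both strict order relations can be illustrated with a minimal sketch. It assumes the relation is given as its full set of ordered pairs; the function name and the step names are hypothetical and not taken from the model.

```python
def is_strict_order(pairs):
    """Return True if the (a, b) pairs form a strict order.

    A strict order is irreflexive and transitive; asymmetry then
    follows, because (a, b) and (b, a) together would require (a, a).
    """
    rel = set(pairs)
    # Irreflexive: nothing precedes itself.
    if any(a == b for a, b in rel):
        return False
    # Transitive: a < b and b < d must imply a < d.
    return all(
        (a, d) in rel
        for a, b in rel
        for c, d in rel
        if b == c
    )


# Hypothetical workflow steps, listed with every implied pair.
workflow_order = {
    ("collect", "clean"),
    ("clean", "publish"),
    ("collect", "publish"),
}
```

The same check applies to a non-temporal sequence (alphabetical, numeric, geographic), which is why those cases would still realize a StrictOrderRelation.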
Other:
  • Methodology has not been added to document - waiting to see how review shakes out - simpler model
  • Signification Pattern - add the documentation that was prepared
  • Binding has been added showing realization of interface - has examples
  • There is a lot in this paper which could be a good academic paper. Should think on this further.
Jay's review of DD - he seems happy with what he is seeing; just need to note coverage etc.
Layering on Run time - not committed for Q2
Review DMT-89
NEXT WEEK'S AGENDA:
DMT-90 is on Codebook (lower priority - does not affect Q2)
 Virtual meeting 2016-09-07

ATTENDEES: Arofan, Wendy, Larry, Dan G., Oliver, Flavio

AGENDA:

  • Timing of review
  • Patterns
  • Data Description 

Timing of review

  • Looks like we should be able to wrap up content by next week then spend a week doing clean-up and packaging to get to TC by 2016-09-23
  • This gives TC a week to finalize associated review materials and put out for review

Patterns

  • Getting patterns finalized
  • Flavio has been updating documentation in Drupal
  • Pattern documentation still needs to be updated (finish by next Wednesday) - may not need simplified user-level documentation; it is not a show stopper

Data Description

  • Modeling work still needs to be done - they are meeting on Thursday
  • There is a new class between an Instance Variable and Physical Layout that needs to be worked out
  • Larry will try and put in content for tomorrow's meeting in Lion
  • Recommended approach of MT: These are important questions to get right; we flag this as not having been reviewed by MT. Mostly adding properties
  • Could possibly review in MT next week

Questions for content group chairs and MT to be sent out by TC:

  • What is still under discussion?
  • What is liable to change during further development?
  • What decisions or structures would you like review and comment on?

Process Pattern

Service, Input/Output, Workflow Steps Design time and run time

  • How Services relate to process steps?
  • GSIM has a bidirectional Process Step performs Business Service and Business Service uses Process Step (not sure this is valid UML). GSIM models business services at two levels - process is high level
  • Other than the high level business process and service we don't change a lot
  • The service is effectively associated with a process step - the purpose is to be able to link it to an agent and provide more detail. Interface in the service is just a controlled vocabulary; in Process Step it is an input/output linking
  • The solution in the model would be to rename it to Process Step Interface and have a Service Interface
  • What we said last week is that what is here is a design time Workflow.
  • When we get to run time we create specific instances of that type.
  • You will have instances and parameters and at run time you link parameters to parameters.
  • We can put in the Binding Instance package
  • We have not covered historical - we want to have the additional information layered on so we want an instance binding, times, agent specification, problems, alternate paths etc. Service interface and mapping to process interface. Get rid of interface property of service
  • NOTE: we have this additional implementation (not done yet) of documenting process as it occurs
  • Remove processPeriod from Service. Leave estimatedDuration in WorkflowService
  • A larger set of properties need to be listed and discussed which need to be added when the pattern is realized for different purposes
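
The split between a design-time Workflow and its run-time instances can be sketched as follows. This is a minimal illustration only: the class names mirror terms used in the discussion, but the fields, the function bind_at_run_time, and the step and parameter names are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Parameter:
    """Design-time parameter, identified by its step and name."""
    step: str
    name: str


@dataclass(frozen=True)
class Binding:
    """Design-time link from an output parameter to an input parameter."""
    source: Parameter
    target: Parameter


def bind_at_run_time(bindings, outputs):
    """Pass each produced output value to the input parameter bound to it.

    `outputs` maps output Parameters to the values produced in one run.
    Bindings whose source produced no value are skipped.
    """
    return {b.target: outputs[b.source] for b in bindings if b.source in outputs}


# Hypothetical design: the interview step feeds the cleaning step.
responses = Parameter("Interview", "responses")
raw_data = Parameter("Cleaning", "rawData")
design = [Binding(responses, raw_data)]
```

The design list is written once; each run supplies its own values, which is where times, agents, problems and alternate paths could later be layered on.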

NEXT WEEK AGENDA:

  • Check in that we are making the progress we need to meet timeline
  • Members should look at items in DMT-89 and correct or comment on in order to get documentation cleaned up
  • Might be able to take on additional modeling for historical process
 Virtual meeting 2016-08-31
ATTENDEES: Arofan, Dan G., Flavio, Oliver, Jon, Larry
AGENDA: Review of the Process Pattern and Workflow object properties.
Process Pattern: Service object, processPeriod property.
- Decision: The Service object in the Process Pattern will not talk about time (processPeriod property). This will be moved to more concrete classes. There are three applications of time: Effective time (Availability), Historic time, and Expected Time. Look at estimatedDuration in the WorkflowService object.
- Service inputs and outputs? Service does not relate correctly to Process Step. Leave as an open issue. Service needs a more complete review, but we should wait until we see how people implement it (regarding what properties they need). There are also issues with the relationship between process step and service. The process step references the service - is the directionality right? And how does the interface property inform this relationship? Should location and interface be controlled vocabularies? Is a Service actually an Agent? Put these questions on the agenda for next week. Review the definitions.
Workflow: Workflow Step. We seemed OK with the existing property set.
Workflow: Binding. Remove usage property. If anyone needs it, they will tell us.
Workflow: Input and Output Parameters. Need a DDIObjectType property (controlled vocabulary). This is not used for specific instances, but we may add a run-time parameter. This is for design time. Value will be the type of any DDI object (including abstract classes and those found in patterns). Add purpose and usage properties.
Flavio will update the patterns document to include the Workflow realization of the Process pattern. May include a Data Capture example in future. The pattern document could be the basis of less-technical end-user documentation in future.
 Virtual meeting 2016-08-24
ATTENDEES: Arofan, Wendy, Flavio, Dan G., Oliver, Larry
Review of new patterns:
Process Pattern
  • Process Pattern has been revised slightly from last document (Input and Output combined), addition of ProcessControlStep
  • CoreProcess has been changed to Workflow renaming a number of classes to reflect this name change and the use of the Process Pattern
  • FOR DOCUMENT: A class in a pattern can inherit from a class in another pattern. It does not realize it. Realization takes place between a concrete class and a class in a pattern
  • ProcessPattern should be completed later today
  • How GSIM-like is Service Interface (now WorkflowService)? It layers on top of GSIM
  • Recommended that processPeriod (Datarange) be changed to estimatedDuration (Date - recommend use of Duration format)
Signification Pattern
  • Changed Signifier to complex data type
  • Walked through and revised related classes that now realize this pattern
Methodology Pattern
  • May need additional clean-up of classes not in MethodologyPattern; should they just be instantiated as additional
  • Wendy is revising Sampling to reflect new pattern

AGENDA FOR NEXT WEEK:
  • Property review (what should be removed from pattern classes, abstract classes, and some that realize them) - need some rules also
  • Documentation review - in next week identify those that need review and which are OK
 Virtual meeting 2016-08-17

ATTENDEES: Wendy, Flavio, Larry, Oliver, Dan G. (second half)

AGENDA:

  • Review and approval of actions on Process Pattern
  • Review and approval of Sign/Signified Pattern

Process Pattern

  • Reviewed revised image based on Dan Smith's comment (emailed Dan to see if this resolved his question)
  • The example in the image addresses the differentiation between Execution (Run) Time and Design Time bindings
  • Right now the example we have of the implementation of a Process Pattern is the Questionnaire/Data Capture implementation
    • This is a design time example
    • We are OK with putting out without a run time example but noting that we are working on such an example within the broader needs of a run time layer that provides actual run time binding, plus action logs, variation from design etc. This is part of the temporal perspectives talked about before (looking forward/design, execution/run, looking backward/what took place). This plays into the issue of capturing processing provenance as well as paradata during data capture operations.
  • Input and Output were determined to be part of the pattern allowing for contextual semantics when instantiating the class
  • Discussion of the use of Implemented Variable in Binding Example; could this be a Represented Variable? (thinking in terms of the Variable Cascade)
    • If one were designing a process (i.e. creating a Poverty Index) then the design would probably use a Represented Variable. When the index was used in a data collection then the Instance Variable would be one that was related to the Represented Variable.
  • Question: should Methodology Pattern and Process Pattern be merged? If not, where should Process live?
    • Dan suggested that they were much easier to understand separately; that we could explain the smaller sets of classes more clearly
    • Right now all of our documentation is related to specific patterns except for general rules on how patterns are created and implemented
    • Will need (post Q2) high level documentation that takes a bird's eye view of the conceptual framework of DDI; how the patterns work together to express similar relationship patterns within different areas of DDI and different applications of DDI

ACTION: Flavio will render new Pattern package in Drupal moving classes to the appropriate locations; any emptied packages will be deprecated 

Sign/Signified

  • Reviewed Flavio's distributed image and earlier document (pg 4)
  • Dan thought name of the class Signified was misleading as it is past tense...Signifiable would be clearer
  • Representation (in Sign) should be changed to signifier
  • This should be made into a Pattern

ACTION: Make sign and signified into a pattern as it separates the existence of concept from the notion of creating a sign (Flavio)


 Virtual meeting 2016-08-10
ATTENDEES: Wendy, Flavio, Jon, Larry
Basic rules regarding realizations:
  • A class can realize more than one Pattern Class as they play a role in a number of patterns
  • Need to make sure there are no property clashes
  • Do we have a way of dealing with this in the PSM
  • Check on the appropriate relationships
  • We need to implement the checks in the processing to an external binding
  • We need to define the rules to check on validation of realizations in the process
Put a statement in asking for review/suggestions for dealing with this
DECISION: Add the InformationFlow (see image) and send out for comments PRIOR to next meeting. We need to get feedback quickly as many are not available for meetings in summer.
Issue around what checks we need to be doing need to be added to XML binding specification
Write up rules for what is needed
  • all properties/relations present
  • multiple - duplication of properties and relations
  • renaming of relationship and capturing change only in the documentation
  • realizing multiple classes
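
The checks listed above could be prototyped along these lines. This is a minimal sketch under stated assumptions: each class is reduced to a dictionary mapping member names to type names, and the member names (contains, hasInput, name) are made up for illustration rather than taken from the model or from the XML binding specification.

```python
def missing_members(pattern_class, realizing_class):
    """Names of properties/relations in the pattern class that the realizing class lacks."""
    return sorted(set(pattern_class) - set(realizing_class))


def property_clashes(*pattern_classes):
    """Names that appear in several pattern classes with different types.

    Relevant when one class realizes more than one pattern class.
    """
    seen = {}
    clashing = set()
    for members in pattern_classes:
        for name, type_name in members.items():
            # setdefault records the first type seen for this name.
            if seen.setdefault(name, type_name) != type_name:
                clashing.add(name)
    return sorted(clashing)


# Hypothetical pattern classes, reduced to member name -> type name.
collection_like = {"contains": "Member", "name": "String"}
process_like = {"hasInput": "Parameter", "name": "Label"}
```

Renamed relationships are not covered here; since the renaming is captured only in the documentation, a real check would need that mapping as an extra input.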
Is the pattern based on anything or just a flower of Flavio's imagination?
Based on discussions of MT and goal to make Pattern classes very generic and need to keep some classes outside rather than inside the pattern
Sign and Signified:
Changing Sign and Signify to a Pattern
This way a Concept can play the role of Signify when a code realizes a sign
This means that Concept would now extend Signified, which would imply that all Concepts are now Signified. I don't think that's what we want to say. A Concept is a Signified only in the context of having a Designation that denotes it. One way of addressing that is to make "Sign - denotes -> Signified" into a pattern that Designation and Concept realize (in the picture, the isA relationship from Designation to Sign and from Concept to Signified would become realizations). That way, Designation and Concept would still extend Annotated Identifiable and we would support the fact that a Concept is not a Signified until a Designation denotes it.
DECISION: Do it and make it a pattern. Put it into Lion.
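
A minimal sketch of the distinction behind this decision, assuming Python dataclasses stand in for model classes. Concept deliberately does not inherit from a Signified class; whether it plays that role depends on a Designation denoting it. The helper is_signified and the example values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    """Stands alone; it does not extend Signified."""
    name: str


@dataclass(frozen=True)
class Designation:
    """Plays the Sign role: a representation that denotes a Concept."""
    representation: str
    denotes: Concept


def is_signified(concept, designations):
    """A Concept is a Signified only if some Designation denotes it."""
    return any(d.denotes == concept for d in designations)


income = Concept("Household income")
poverty = Concept("Poverty")
signs = [Designation("HHINC", income)]
```

Here income is a Signified because a Designation denotes it, while poverty remains a plain Concept until it gets one.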
High level documentation needs revision based on these changes
Who is the audience for the documentation: content provider or the software creator
How to review the XML Binding:
  • Does this work?
  • Is it reflecting the model?
  • We might need to change the binding rules to amend the XML binding
  • This is one option for an XML schema structure (the choice is part of testing)
 Virtual meeting 2016-08-03

ATTENDEES: Arofan, Wendy, Larry, Flavio, Dan G.

AGENDA:

  1. Designation and CodeItem (paper from Flavio)
  2. Cardinalities - relations (spreadsheet)
  3. Process Pattern
  4. Agents and Agent Registry View

1 -

  • Reviewed paper and agreed to go ahead given the level of discussion between Flavio and Dan. It will be a specified area for review
  • ACTION: Flavio will enter in Lion

2 -

  • Reviewed remaining cardinalities noting some additional issues
  • Agreed on final entry for Q2
  • ACTION: Wendy will enter - DONE

3 - Flavio will send something out for next week to review

4 - Review during week and raise any issues - on next week's agenda for approval

Note: DD is meeting Thursday and will hopefully provide content for Q2 review as a result


 Virtual meeting 2016-07-26

ATTENDEES: Dan G. and Flavio.

We discussed the Designations and Code Item document that Flavio started. The document provides a general introduction to modelling Signifier, Signified and Sign and some of its subclasses. Dan agreed with most of the proposed model, except for the issues below.

Flavio's model had associations to both Code and Category that are specializations of those in Node for Designation and Concept. Dan raised the point that it could be confusing to have those specialization and proposed to add a constraint to the documentation to deal with the fact that the potentially multiple Designations and Codes associated to a Code Item have to refer to the same unique Category. Flavio agreed.

Dan's constraint is the following:

Code Item must contain:

  • One or more Designations
  • At least one of the Designations must be a Code
  • All the Designations must be synonyms, i.e., be associated with the same Concept
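
Dan's constraint lends itself to a mechanical check. The following is a minimal sketch, assuming simple Python dataclasses in place of the model classes; the field names and the function check_code_item are illustrative and not part of the model.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Concept:
    name: str


@dataclass(frozen=True)
class Designation:
    representation: str  # the Signifier, held as a simple property
    concept: Concept     # the Concept this Designation is associated with


@dataclass(frozen=True)
class Code(Designation):
    """A Designation that is a Code."""


@dataclass
class CodeItem:
    designations: list = field(default_factory=list)


def check_code_item(item):
    """Return the violated rules; an empty list means the Code Item is valid."""
    problems = []
    if not item.designations:
        problems.append("no Designations")
    if not any(isinstance(d, Code) for d in item.designations):
        problems.append("no Code among the Designations")
    if len({d.concept for d in item.designations}) > 1:
        problems.append("Designations refer to different Concepts")
    return problems
```

For example, a Code Item holding Code "1" and Designation "M" for the same Concept passes, while one holding "M" and "F" for two different Concepts and no Code fails two of the rules.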

In addition, Flavio's model is an attempt to simplify how Signifier is represented. The notion of a Signifier is too abstract for most modelers and users. Flavio's proposal is to make the Signifier into a property of Sign called representation. The alternative is to make Signifier into a separate class.

The former approach is simpler but it can be limiting for some applications. For instance, as Dan pointed out, it’s easier to perform lexicographical analysis (e.g. homograph identification, stop words removal, stemming, term categorization) when Signifiers are first class objects and therefore can be more easily manipulated. After some discussion Dan and Flavio noticed that the issue was really more about a physical representation rather than a conceptual one, so they decided to have the simpler representation with Signifier as a Data Type in the conceptual/logical model and clarify in the documentation that people have the option of making Signifier into a class in their physical implementation.

ACTION: Flavio will update the document and model, send it one last time to Dan for review and present it in the next modelling meeting. Flavio will start the cleanup of the Drupal after that.

 Virtual meeting 2016-07-20
ATTENDEES: Wendy, Flavio, Dan G., Oliver, Achim
Cardinality discussion
  • Wendy created a list of all classes with required relations and reviewed them
    • Using the cardinality rules on the Guidelines page recommendations for easing constraints were made by Wendy and reviewed by Flavio
    • A number of constraints were reinstated with specific reasons provided (content in blue on spreadsheet)
    • some items (in red and highlighted in pink) are still under review and will be completed within the week
    • Review by Flavio has been completed through green highlighted item
  • Purpose is to review the logic and determine how to handle questionable items for this Q2 release then complete the list off-line and make the needed changes
  • items highlighted in red will be discussed next week in light of process pattern
Specific points in discussion:
  • CategorySet/ isBasedOn/ Concept logically is required however the question was raised if this would cause difficulties in production processes (i.e. loading metadata from a statistical program). It is not part of a pattern so that factor does not come into play
    • One way to look at it is to send it out with the constraint and see who responds with problems
    • Send out with limited requirements and ask for review
  • Is the abstract NodeSet necessary? Could the two associations isBasedOn and hasLevel be added to Collection or just push the associations to the sub-classes.
    • Keep as is and raise as a review item. it was initiated as an expedient before the Collection pattern was created. It was copied over directly from GSIM
  • We are missing that a codeItem is an element of a codelist. Actually we use the term "Code"
  • A designation has a particular meaning and is the extension base of Code
  • Minimum is to have the CodeItem fixed (Code to CodeItem and value as code) switch extension base to Node
  • Where does Designation fit in? Is it overkill? It's an association; thereby the category is designated by the code.
  • This makes CodeItem and Sign identical. However, we have the following requirements:
    • Talk about designations in a meaningful way
    • Talk about Classification, CodeLists, and CategorySets in a meaningful way
  • If we can simplify this let's go for it. (Flavio and Dan will work on this this week)

ACTIONS:

  • Dan and Flavio will review the CodeItem/Designation/Sign situation and look for a way to simplify this (for next week)
  • Flavio will finish reviewing the required relations for next week
  • Wendy will enter the agreed changes on required relations
  • Flavio and Wendy will complete work on Process Pattern including Binding issues for next week
  • Wendy will complete revision of Agents and Agent View for next week

DECISIONS:

  • Where cardinality constraints are in question we will leave required and ask for review of these (i.e. are there known cases where this causes problems)
  • Retain Node/NodeSet and raise as a review issue. These were created prior to CollectionPattern and may not add anything to the structure
 Virtual meeting 2016-07-13
ATTENDEES: Wendy, Flavio, Oliver, Larry
ACTION:
  • Implement cardinality recommendations for properties and deal with any related issues during review.
  • wlt will implement
PATTERNS:
  • Everything in a Pattern should be realized when implemented
  • Everything in a Pattern package extends (does not realize)
  • Implications of specifying what is in a pattern and what isn't
  • Process: only the very abstract pieces would be part of the pattern and are realized
  • When things are patterns can you realize something that is at a more
  • Extension to classes in a pattern stays in the pattern
  • Classes outside the pattern realize the pattern and can serve as abstract bases for extension
  • Need to review Custom Metadata in its use of the pattern in terms of explicit relationships
  • Do we need all of the relationship possibilities or just the common ones
ACTION: For Collection
  • Get the realization out of the pattern - document what classes can be realized
  • Document the dependencies - not all classes can be realized independently
  • Review implementation/realization of the pattern in the packages to make sure it is done correctly
ACTION: For process and methodology
  • Finalize an initial Binding approach
  • Look at 3 of the classes as part of the pattern (Methodology, Design, Algorithm, Process) and other related classes could be realized...decide which
  • Create a suggested approach for methodology group
ACTION:
  • Write up guidelines for how to decide what is and is not in a pattern
    • Need for local naming
    • Need to retain certain hierarchies
BINDING:
  • ProcessStep is on the abstract and Binding non-abstract so that a class needing an input output needs to realize a ProcessStep (Act, ControlConstruct)
DOCUMENTATION: wish list
  • Being able to see the classes that point into a package as well as out of a package would be VERY useful - comment attached to DMT-5 on 2016-07-15
 Virtual meeting 2016-07-06

ATTENDEES: Arofan, Wendy, Flavio, Larry, Dan G.

Reviewed the following documents, edited, and posted:

  • Cardinality guidelines
  • Property fixed and default values
  • Identification, Annotation, and Complex Data Types

ACTION: Implement decision regarding Identifiable and Annotation - completed 2016-07-06 wlt

NEXT WEEK: Pattern Review 

 Virtual meeting 2016-06-29

ATTENDEES: Wendy, Flavio, Achim, Jon, Larry, Oliver, Dan G. (Arofan had a work conflict)

AGENDA:

  • Cascade as a pattern
  • Cardinality documents from TC
  • Identification documents from TC
  • Document on fixed and default values
  • Pattern realization issues arising from Flavio's work

Pattern realization issues arising from Flavio's work

  • Reviewed the description of abstracts in DMT-21 as written at the Edmonton Sprint
  • Item 2 seems unclear: Patterns are implemented with "realizes" relationships but (mostly) use sub-classing from abstract classes
  • This is interpreted as meaning that the major "abstract" entry points in a pattern are realized but the overall pattern contains abstracts that serve as the heads of sub-classes
  • All of the Collections package is a pattern
  • We need to go through and verify what is abstract in a pattern - anything we don't want people to use directly (instantiate) and for which they should use sub-types
  • ACTION: Wendy will go through the classes in a pattern and suggest which should be abstract and why (capturing the rules base as this is done); Flavio will review and once final decision is made we will have a set of rules for creating and applying abstracts in the model. These decisions can be part of the Q2 review and revisited in light of any comments.
  • ACTION: After we ensure that everything in the package is part of the pattern, add the term "Pattern" to the name of the package. Documentation should also be added regarding the fact that the content of the package is a pattern and should be dealt with as such.

In UML patterns should be interfaces

  • We don't have this option in Drupal
  • ACTION: Entered DMT-88 to review our options in terms of the use of interface and implications for Drupal at a later date

Change of Member to realize

  • We can use extensions within a pattern but outside the pattern is by realize
  • So a Member in the collection pattern is not realized but is an extension
  • Patterns should extend Identifiable and let the instantiation determine if it should be AnnotatedIdentifiable or just Identifiable
  • ACTION: Flavio will make this change. NOTE that means everything within the pattern becomes a simple Identifiable. Annotation can be added at the point of instantiation

Idea that NodeSet and Node are a type of pattern

  • CodeList and Category Set are both extensions of NodeSet but Code and Category are not an extension of Node
  • We saw we had similar things going with Category Set, CodeList etc. So a Node has categories attached. GSIM uses CodeItem and CategoryItem.
  • ACTION: We will do this (create a CodeItem and CategoryItem) and test out the consequences in Q2

We have datatypes and value domains in the model but they are not related

  • DataType should be a property of type ExternalControlledVocabularyEntry (currently an annotated identifiable containing only a property of type ExternalControlledVocabularyEntry)
  • Shouldn't there be a relationship between datatypes and value domain? In fact a datatype (ISO 11404) is closely associated to a value domain
  • The use case Flavio had in mind is when it's associated with the InstanceVariable and datum at the physical level - should a value domain have a datatype?
  • ACTION: Remove class DataType and replace relations in RepresentedVariable and InstanceVariable with the appropriate property and data type ExternalControlledVocabularyEntry (DONE)
  • ACTION: Add to ValueDomain property recommendedDataType 0..n ExternalControlledVocabularyEntry (note that there may be more than one type and/or more than one External Controlled Vocabulary) (DONE)

NEXT WEEK'S AGENDA and FOLLOW-UP WORK:

  • Request review of the 4 documents noted in this week's agenda PRIOR to next week's meeting so we can identify any issues that need discussion to try and move through these quickly
  • Move those items plus Cascade discussion to next week's agenda
 Virtual meeting 2016-06-22
ATTENDEES: Wendy, Arofan, Flavio, Jon, Larry, Oliver
DMT-75 OK change Member to Identifiable
Done
How to realize the pattern in Drupal -
  • There is no easy way of representing this in Lion (don't have the ability to create a fixed value in Drupal)
  • Is there a way to do this in the transformation to turn these into fixed values? Does this have to occur in Drupal and be transported through XMI?
  • Default values don't come through either - need a column to set default/fixed and declare the value
  • We need to overwrite when properties are inherited from BinaryRelation
  • The overwriting mechanism takes place when flattening takes place
  • Change documentation - "Value must be ..." "Fix value to..."
  • "Default value is xxxx"
  • Flavio will write this up for addition to Guidance documents
  • Create issue to capture this and deal with this in Drupal post-Q2
What else should be a pattern?
Methodology -
  • Think it is a pattern - Process and Design
  • How would this be realized in Sampling
  • Come up with a document of what we want to do and present it to the Methodology group to see if it makes sense

Cascade - next week

List of relationships - when and how used...good documentation
  • realizes - only used when realizing a pattern
  • implements - clearly a relationship
  • denotes - definition needs to be adequate
  • something we meant as realizes but it was backwards isrealizedby
  • See Dan's document about relationships 
Cardinality - TC will discuss tomorrow and get to MT after meeting
 Virtual meeting 2016-06-15

ATTENDEES: Arofan, Wendy, Achim, Jon, Oliver, Larry, Flavio, Dan G.

AGENDA
Idea that Jon had in Edmonton on how we packaged the documentation and schema so they make sense to the user
  • Suggestion was to park this and get responses from review process and focus on that for Dagstuhl
Document on Use of Standard Properties in DDI 4
  • Hierarchical display?
  • Maybe a separate package for building blocks for ComplexDataTypes that are not to be used directly
  • Usability: move this into Drupal as explanatory text
  • Test out what happens with the property tags - Jon tested it worked!
  • Supporting data types - in another package? does this create a problem viewing
  • Is there a way to get this grouping into Drupal
  • Solution: remove Property, but remove ComplexDataType from list of classes that show up as relation targets
  • Should we circulate? Make a clear home for these on the Modeling Team site
ACTION: Create clear location on Modeling Team page and circulate document along with link
  • Guidance for Business Modelers Page:
  • Make the DO NOT usage things more visual
  • Update the documents on Functional Views

AGENDA next week:
  • Additional patterns/micro patterns - methodology, cascade (variable, instrument, question), relation list (implements, realizes, denotes, etc.)
  • Page 'Guidance for Business Modelers': classes to views (3 papers), design patterns, complex data types, what other "how to" do we need
  • Cardinality rules - when should something be mandatory? TC will work on rules for this and send to Modeling for approval and implementation
 Virtual meeting 2016-06-08
ATTENDEES: Wendy, Arofan, Achim, Flavio, Oliver, Larry, Dan G., Jon
  1. A number of classes have been added that don't connect to anything
    1. i.e. freezing what's going out in Q2 didn't happen
  2. Still lots of new classes that have Annotated identifiable extension base and NO content at all
  3. They did not use the instructions for building the codebook Functional View...Larry built it from his spreadsheet so it has problems which have given rise to Ornulf's comments
  4. I am reviewing all the content, again, for pattern usage, empty content, and orphan classes.
  5. also doing a full review of the Functional Views. My feeling is that any Q2 FV's should have a reasonable use case (AgentRegistry, data dictionary combining logical and physical data description, statistical classification, Catalog, basically nothing without a real use case)
  6. Data Description is still working but they have decided that maybe they should focus on describing rectangular and CSV files
Realization of a FV that was complete:
  • Restrict only relations by non-inclusion of class - Documentation to describe this
  • Ornulf's paper - Created at the end of the codebook sprint and is his opinion
  • Jon - his position wasn't that everything has to be in an FV
  • Possible FVs: SimpleInstrument, Agent Registry, DiscoveryView, DataDictionary, StatisticalClassification
  • The purpose of Codebook is to test all the perspectives of the DDI Model - Production, content, model level, transformation to specific bindings, documentation
  • Limit properties in documentation only listed in the FV level documentation
  • Add note to instructions that properties that are not pertinent should be documented
  • There is also work needed to complete the original Q2 content in terms of applying patterns
  • There is also a lot of work being done on the documentation: https://ddi-views.readthedocs.io/en/latest/index.html
  • XML binding problems: codebook pulls from many parts of the model - usability of the schema itself and of the documentation
  • Could we do a small focus group to review the XMI, Documentation, etc. in the package and walk a crew of people through the package?
  • Who do you get to run this? List of questions and guide wherever you want to go. Look for someone to do
  • Is there a general pattern for Conceptual, Represented, Instance? Variable, Question, etc?
  • Bring the idea of combining Q2 and Codebook to AG
 Virtual meeting 2016-05-18

ATTENDEES: Wendy, Flavio, Larry

Reviewed a draft (verbally related) of the details of how to go about selecting and entering Functional View content. Looked at the list made by Codebook group as well as the tool Larry has been developing. Instructions will focus on the intellectual side of selecting classes for entry, decision points, and documenting. Mention will be made of work being done on tools to facilitate this but specifics won't be added until tool availability is more stable and Drupal is altered.

Looked at bindings and talked about draft of recommendation for Binding changes. This involved a recommendation to change Member to Identifiable and short discussion of reviewing patterns to see where these could be stripped down so that simple usage did not carry as much baggage. Recommendation was entered as a DMT issue.

 Virtual meeting 2016-05-11

 ATTENDEES: Arofan, Wendy, Larry, Jay, Oliver

AGENDA:

  • 71 Q2 read and comment on for next week
  • 70 Q2 Do through binding issues (fix at that point) - DONE to this point
  • 69 Q2 DONE
  • 68 Q2 DONE
  • 67 Q2 implemented -- document wendy
  • 66 Q2 should decide and do  
    • We know and will address when Geography is sorted out
  • 59 Q2 in process Wendy Arofan  
    • Go with the edited view
  • 34 Q2 draft almost done - Wendy
    • Complete prior to Bergen. Put on 5/11 agenda. Go with what is there and expand for Norway - test it out with them; this is an iterative process. 34 will stay open for some time

Regarding Views:

  • Some continued concerns about how to decide what to include, at what depth, relationships between major objects, etc.
  • having a graph of the view in Drupal would be helpful

Larry will add new issue on purpose of DocumentInformation and its use

  • DMT-72 DocumentInformation is seen as both summary discovery information and metadata documentation
 Virtual meeting 2016-05-04

ATTENDEES: Wendy, Dan G., Jon, Flavio, Oliver, Larry

ITEMS for next week's agenda:

  • 71 Q2
  • 70 Q2
  • 69 Q2
  • 68 Q2
  • 67 Q2 implemented -- document wendy
  • 66 Q2 should decide and do 
  • 59 Q2 in process wendy arofan
  • 34 Q2 draft almost done - wendy. Complete prior to Bergen. Put on 5/11 agenda


Disposition of remaining issues from last week's review

 for Q2

  • 63 Q2 in process
  • 61 Q2 decisions made and implemented writing documentation
  • 59 Q2 in process wendy arofan
  • 54 Q2
  • 53 Q2
  • 51 Q2 XML is in hand
  • 50 Q2 jon
  • 49 Q2 draft almost done - wendy
  • 47 Q2 wendy
  • 36 Q2 entries have been reviewed...need to extend to remaining content
  • 31 already done for schema production; documentation is a bit more of a problem. Source files are not being updated automatically on Read the Docs. Is it possible to do this in Bergen? If not, then at the Dagstuhl Sprint or resolved in between. 1) How the source documentation gets shoved into XMI and who writes something to get it out. Bergen: Oliver and Jon, Achim will look at this. Push RST stuff into a field in XMI from which it can be pulled out. Sort out exact problem. Not a show stopper for Q2

Codebook

  • 55 Codebook

Later

  • 52 later
  • 46 later
  • 43 later - linked to 31
  • 42 later - linked to 31
  • 39 later - would be nice to have for Q2 could be a side effect of Bergen by having instances for Codebook into DDI4
  • 37 later
  • 32 later
  • 28 later
  • 27 later


 Virtual meeting 2016-04-27

 ATTENDEES: Wendy, Flavio, Dan G., Oliver, Larry, Achim

Realizing a pattern - as a class, in a view

  • A class must contain all the parts of pattern that it realizes (it must be the full pattern)
  • Restrictions of relations takes place in a Functional View

Creating a View:

  • Only classes listed in the view will be available in-line within an instance of the Functional View
  • Classes can be constrained through documentation and by NOT including a class that has a relation to an included class (For example if I included a Process but did not want to allow a description of the ProcessSteps and routing I would NOT include the class Sequence)
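The non-inclusion rule above can be sketched in a few lines. This is only an illustration, assuming a hypothetical in-memory model in which each class lists its relations as (relation name, target class) pairs; the relation names and the `Result` class are invented for the example and are not taken from the DDI4 model.

```python
# Hypothetical model: class name -> list of (relation name, target class).
MODEL = {
    "Process": [("hasSequence", "Sequence"), ("hasResult", "Result")],
    "Sequence": [("contains", "ProcessStep")],
    "ProcessStep": [],
    "Result": [],
}


def realize_view(model, included):
    """Return the relations usable in-line within a view.

    A relation survives only when its target class is also listed in the
    view, so leaving a class out constrains every relation pointing at it.
    """
    included = set(included)
    return {
        cls: [(rel, tgt) for rel, tgt in model[cls] if tgt in included]
        for cls in included
    }


def constrained_classes(model, view):
    """Classes that lost one or more relations and so need documenting."""
    return sorted(cls for cls in view if len(view[cls]) < len(model[cls]))


# Include Process but NOT Sequence: the steps and routing become unavailable.
view = realize_view(MODEL, ["Process", "Result"])
to_document = constrained_classes(MODEL, view)
```

Here `to_document` corresponds to the "list of constrained classes" that the View documentation is expected to cover.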


Include in a View -

  • DocumentInformation
  • Annotation
  • Enter each primary class
  • You do NOT need to enter Extension Base or Abstract classes as this is handled in the transformation process and binding
  • Enter each class with a relation to that class which you wish to include in-line
  • In the View documentation include the following:
    • Purpose of the Functional View
    • Use Case
    • Audience
    • List of constrained classes (those where one or more relations were not entered) - document use
    • Documentation of any specialized use of classes within this Functional View
    • General documentation on the use of the Functional View

Additional notes for documentation: 

  • [include discussion of how Abstract classes are incorporated]
  • Viewer has to respect constraints in terms of required relations. - means we need to be very careful about required classes
    • There should be very few 1..1 or 1..n relations but they need to be explicit in the view.


Binding: Arofan and Wendy are reviewing for functionality

CommandCode:

  • Originally a ComplexDataType containing a set of unidentified means of relaying specific coding plus a general description; in 3.2 these were always part of an identified packaging object such as a ControlConstruct or ProcessingInstruction.
  • DDI4 has changed each code capture type and made them Annotated Identifiable classes (the ComplexDataType still exists but now consists of an unidentified bundle of 3 identified classes)

Goal:

  • To retain the ability to express a set of code in one or more ways
  • Get the proposed solution together for group to decide

Before Q2:

  • 4 - high  Achim
  • 5 - high  Achim
  • 7, 26, 30 Wendy
  • 9 - Oliver keep updated
  • 12 - document
  • 16 - better to do sooner than later
  • 20 - Jon
  • 21 - Wendy
  • 23, 50 - Jon

Before Codebook:


Later:

  • 2 don't lose but it's not a showstopper
  • 3
  • 6
  • 18, 22,  also related to structured documentation
  • 25
 Virtual meeting 2016-03-30

ATTENDEES: Arofan, Wendy, Dan G., Larry

Discussed organization of NADDI Sprint including draft of content, ground rules, and priorities

Added DMT-10 as a critical issue: versioning of the model

Wendy will review issues to see which have summaries and recommendations (primarily those from TC review Q1 results) and identify those that need this done

Larry will prepare a summary for DMT-10

Very ambitious schedule requiring some ground rules:

  1. Issues will have a summary and framework for discussion prepared before presentation to the group (including virtual attendees)
  2. Small groups (1-2 person) will be assigned to prepare these for discussions later in the week. Early discussion topics will have this done prior to the Sprint
  3. Goal is for a testable approach. It may not have full agreement but it will be put out for review with request for particular attention to points of concern 
 Virtual meeting 2016-03-23

 ATTENDEES: Arofan, Wendy, Oliver, Flavio, Larry, Dan G., Jon

AGENDA:

Process model update (Action items in bold)

  • Flavio has made some small changes on the model in terms of how temporal associations work
  • Methodology Model should be in shape to be discussed at NADDI in terms of how it interfaces with Core Process
  • Flavio noted things that required slight modifications to clarify the relationships between Process and ProcessStep, that the steps are contained by the Process
  • However we capture the semantic of perspective it should be consistent in how we tell people to do it (the pattern or the use of the pattern)
  • In respect to capture a process model would be used to describe an instrument/capture of data
  • There are a lot of funny connections you have to make to make that work
  • What is the complexity of data editing and processing? depends on when you do it? Imputation is a good example for complexity. Educational test scoring could also be complex.
  • We need to identify some use cases that would be good to test against. CAI systems (Dan and Jeremy); Imputation process; others?
  • You could do a lot if you could just represent that this script follows that script using these inputs and producing outputs
  • In review we should ask people to test use of related code (SPSS, SAS, BLAISE, etc.)
  • Qualitative: Analytic Metadatum - there seems to be an extra level of abstraction there
    • This may fit into discussions of abstraction and use of Member
  • Review of additions to document on modeling temporal relationships:
    • Two ways of expressing sequencing (page 8): traditional sequence ordering by creating strict ordering relations
    • What is new is the addition of Sequence Order Relation containing Sequence Order Pair
    • Temporal Asymmetric Relation extension of Asymmetric and Precedes Interval Relations (realizes Temporal Asymmetric Relation) containing Precedes Interval Pair
    • Are we giving people more than one way to define the same thing? Not really. It lets people get fussy when they need to.
    • A little concerned about use cases. Are we modeling more than we actually ever need? I think with medical cases, trials, educational testing, and parallel processing. There is a case for a simple order but when they need the complexity they need it. We should raise this as a question in testing.
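As a rough illustration of how a set of order pairs relates to a simple sequence, the sketch below derives one valid ordering from (before, after) pairs. It assumes Python 3.9+ for the standard-library `graphlib` module; the step names are hypothetical and the function is not part of any DDI tooling.

```python
from graphlib import CycleError, TopologicalSorter


def order_from_pairs(pairs):
    """Derive one valid execution order from (before, after) pairs.

    Each pair states that the first step precedes the second, which is the
    information a sequence order pair is meant to carry.
    """
    sorter = TopologicalSorter()
    for before, after in pairs:
        sorter.add(after, before)  # 'after' depends on 'before'
    return list(sorter.static_order())


# Hypothetical processing steps forming a simple chain.
pairs = [("collect", "edit"), ("edit", "impute"), ("impute", "tabulate")]
order = order_from_pairs(pairs)
```

A plain ordered list covers the simple case; the pairs become useful when only some steps are constrained relative to each other (parallel processing), and a contradictory set of pairs raises `CycleError` instead of yielding an order.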
 Virtual meeting 2016-03-16

ATTENDEES: Wendy, Flavio, Larry, Oliver, Johan

Discussed updated document on Processing Model and issues arising in Methodology Team work

Methodology model suggested changes: (note that Methodology Group has not discussed this proposed model change yet. It was brought to MT because of impact on Core Process relationship between Process and ProcessStep) see Methodology Group minutes for 2016-03-21

  • Change in Process hasResult to 0..n
  • Add in Process hasSequence 0..1

Core Processing Model:

  • We want a ProcessStep separate from Process so we need a concrete class of Process (in Methodology). ProcessStep should no longer be an extension of Process.

Document:

  • Went through additions in Process model document and noted where additional documentation was needed. Described Temporal relationships in sequencing. Expanded model on page 3. Explanation is done in 2 pieces (temporal relations in sequencing)
 Virtual meeting 2016-03-09

 ATTENDEES: Arofan, Wendy, Dan S., Oliver, Johan, Larry, Flavio, Jon

DMT-58:

  • The way the variable cascade is modeled, the representation of values resulted in sentinel values only at the instance level.
  • Represented only had substantive values.
  • Statistical packages recode and have different means of recording missing values.
  • How do you indicate that you want particular categories for the missing values?
  • Proposal of switching hierarchy of described/enumerated and substantive/sentinel.
  • Add the ability to add sentinel categories (not codes) at the represented variable

Dan Smith:

  • The point of the stratification is to have conceptual, represented, [ISO 11179] and use (instance variable)
  • If we do this we have things that are not "represented" but just conceptual
  • The capture collection and questionnaire link to the represented variable
  • If you don't have the codes at the represented level how does the capture software understand what to use to code?
  • The conceptual variable doesn't say you have these specifics of how things are
  • Do we have a mechanism for mapping sentinel values where this is required?
  • Should the specification of substantive and sentinel be moved to conceptual (only categories not codes)?
  • Example: A csv 1=Male 2=Female. If I write a script to change these codes to 0, 1 this doesn't change the logical content

Solution:

  • Conceptual Variable "uses" SubstantiveConceptualDomain and SentinelConceptualDomain (each with their enumerated and defined)
  • Need to make sure all inherited things are optional
  • We need to write up and present to Dan G.
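The recoding example and the proposed split can be sketched as follows. This is a simplified illustration, not the DDI4 class structure: the `ConceptualVariable` and `Representation` classes, the "Refused" sentinel category, and the code 9 are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConceptualVariable:
    """Categories only, no codes; sentinel categories are optional."""
    name: str
    substantive: frozenset
    sentinel: frozenset = frozenset()


@dataclass
class Representation:
    """One physical encoding of the variable: code -> category."""
    variable: ConceptualVariable
    codes: dict

    def categories(self):
        return frozenset(self.codes.values())


def recode(rep, mapping):
    """Change the codes (old code -> new code) without touching categories."""
    return Representation(
        rep.variable, {mapping[code]: cat for code, cat in rep.codes.items()}
    )


sex = ConceptualVariable(
    "Sex", frozenset({"Male", "Female"}), frozenset({"Refused"})
)
csv_rep = Representation(sex, {1: "Male", 2: "Female", 9: "Refused"})
recoded = recode(csv_rep, {1: 0, 2: 1, 9: -1})
```

Both `csv_rep` and `recoded` expose the same set of categories, which is the sense in which a recoding script leaves the logical content unchanged.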


Issues of comparability:

  • I have a concept that is hierarchical and therefore can support a determination of comparability at various levels
  • If you are trying to do this programmatically without humans to make decisions we aren't going to get there. We need to provide sufficient content for people to make decisions within a specific context.
  • People who reuse variables get this (maybe not the terms) but the differentiation makes sense to them.
  • Because the components are optional you can relax or constrain more based on the usage situation


NEXT WEEK:

  • Review expansion of Flavio's document
  • Review Larry's remodeling to reflect discussion of DMT-58
 Virtual meeting 2016-03-02

ATTENDEES: Arofan, Wendy, Larry, Jon, Flavio, Oliver, Dan G.

Collection Pattern:

  • Added n-ary relation to the paper to cover things that could not be represented in binary relations and to support a more compact representation of binary.
  • Added example of classification.
  • More could be written (explain correspondences) to provide a bit more general description of correspondence and relations. This would add a page or two.
  • Incredibly useful document if the audience is the people who need to wrestle with these things, i.e. modelers. We need a version intended for the implementation audience. A use case approach would be helpful for this audience.
  • Added the Ordered Tuple and Asymmetric Binary Relation. This could be a shorthand for relating binary pairs. In hypergraphs you don't have equivalent pair relations.
  • We need to document when we recommend the use of n-ary in a binary relation (hierarchy situation of one parent with multiple children).
  • Issues of direction: some mapping uses are asymmetric (source and target), others are symmetric (maps).
  • While a binary is a specific case of the ordered tuple it is easier to understand. Also the properties of reflexivity would then only apply to the binary case and would be more difficult to explain and model.
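The relationship between the compact n-ary form and the binary pairs it stands for can be sketched as below. The function names and the tiny hierarchy are hypothetical; the point is only that one (source, targets) tuple is shorthand for several ordered pairs and that no information is lost either way.

```python
def expand(tuples):
    """Expand (source, [targets]) tuples into ordered (source, target) pairs."""
    return [(src, tgt) for src, targets in tuples for tgt in targets]


def compact(pairs):
    """Group ordered pairs by source, keeping first-seen order."""
    grouped = {}
    for src, tgt in pairs:
        grouped.setdefault(src, []).append(tgt)
    return list(grouped.items())


# One parent with multiple children, written in the compact n-ary form.
tree = [("A", ["A1", "A2"]), ("A1", ["A1a"])]
pairs = expand(tree)
```

In this sketch `compact(pairs)` gives back `tree`, so the two forms carry the same content and the choice between them is a matter of guidance on usage.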

Issue from Codebook:

  • Need for methodological description in codebook at a high level. Methodology group is focused on detail.
  • The main modeling issue is how this should be accounted for in DDI as this seems like it is a common pattern.
  • Duplicative content if you have two separate. We should have an element/property like Overview that is used in a consistent way.
  • File it in DMT

Process Pattern:

  • Are we confusing people?
  • The diagrams are from the Core Process.
  • How do the multiple process packages tie together?
  • As of Dagstuhl there was a tendency of getting rid of the semantic of historical and prescriptive process. Doesn't impact Core Process model. Only a few things in addition are needed for historical.
  • Find the discussions on Role from Dagstuhl so we can incorporate those extensions.
  • The pattern should be the common stuff and other stuff needed should be on the concrete use of the pattern.
  • Workflows and Bindings are core.
  • Have some words around process before the Sprint and completion of model during the Sprint
  • Flavio will try to put something for next week on more of the collection pattern and something for the process pattern.
  • Divide Control Constructs and Bindings as separate use cases
  • Having a stated static version of the pattern comes first.


Sentinel DMT-58

  • Suggest adding an enumerated sentinel domain
  • It's an issue of reuse of the Represented Variable.
  • Get Dan Smith on call next week to discuss this. 
 Virtual meeting 2016-02-24

ATTENDEES: Wendy, Larry, Achim, Oliver, Flavio, Dan G., Jay

AGENDA:

  • Additions to Collections Document
  • Start on Processing Document
  • DMT-58

Additions to Collections document

  • Send to Jon for him to pull off a "cookbook" for realizing a Collection or modeling a relationship.
  • There is still a bit of detail to be provided by Flavio but Jon should get a start
  • Will this support multiple relation types (BLS example). It should be flexible enough to handle this as each pair can have a different relationship
  • The release should be tested against this type of collection and the results fed back into the documentation

ACTION:

  • Send to Jon
  • Flavio will complete work
  • Stress test it against the BLS example (before or following release of 2016 Q2)

Start on Processing document

three images from email

  • Image 1: the conditional control construct requires a sequence; ElseIf has to be used (optional) with IfThenElse; Sequence/constructor provides a means of dynamically creating the sequence; Sequence is the mechanism for nesting; process step has two subtypes (ControlConstruct and Act)
    • Question: Do we need RepeatWhile and RepeatUntil if we have Loop? These take different kinds of input and output, and if you don't have the different kinds you have to have some form of transformation to fit the construct (Loop or IfThenElse). It's more for convenience than anything else to allow people to simplify their code.
    • Still need good documentation
  • Image 2: We can see how this aligns with the relation changes we've made to the model; Create a sequence order or constraints expressed by a Temporal Interval Relation
  • Image 3: The bindings are just connecting the inputs and outputs; this has not been reviewed with examples; needs testing; the bindings are part of the control construct (see Dagstuhl presentations)
    • Test the ability to bind at the point of "reference" (reusable processing instruction - calculation for a median for example)
    • Need a good set of test cases for this

ACTION: Develop list of examples

Issue DMT-58

  • When you look from the instance variable and represented variable side you have a different view from when you look at from
  • Changes to hierarchy Substantive and Sentinel containing Enumerated or Described
  • Question: Is there ever a case of described sentinel value? Yes, there can be number ranges etc.
  • Question: Why the association from the Represented Variable to the Sentinel Value but the DescribedSentinel Value Domain points to the Instance Variable? Isn't the Value domain always the target? Variables share value domains and the domains don't always track where they are used. We need to be consistent in this.
  • Red box: From ISO/IEC 11179 you can associate a Conceptual Variable (Sex) with a conceptual domain (Male, Female)
    • If a Represented Variable points to Male, Female and Other then there has to be a conceptual domain that contains Male, Female and Other, so the two conceptual domains don't coincide and you want them to. We shouldn't specify sentinel categories at the conceptual level. Why not? Because then we would have to separate the conceptual vs the sentinel categories and have two conceptual domains associated with the Conceptual Variable. If your conceptual domain has both, when you get to the represented you'd have to have codes for all that, and the whole notion of separating the two was so we didn't overload the value domains with both.
  • We haven't agreed that the sentinel should be described at the conceptual level.
 Virtual meeting 2016-02-17

ATTENDEES: Arofan, Wendy, Dan G., Achim, Flavio, Oliver, Johan, Larry

Flavio's DDI Patterns document

  • Do we need to expand our profile if this is how we are describing it in our model (realizes semantic)
  • Would a realize share all the methods? No you need to recreate them, this is just a pattern. It requires that you implement all the methods. It's bloody minded inheritance with the classes and attributes.
  • What are the mechanics behind the relationship? You can constrain but not expand.
  • Class Binary Relations should be a simple association Member to OrderedPair and Member to UnorderedPair
  • All the dashed lines are enumerations and won't be in Drupal
  • The way users will try to understand these will be trying to understand the semantics
  • One of the places to start is Dan's paper on how to break apart symmetry, reflexivity and transitivity
  • This is a formalism that is behind the scenes. The end user needs separate documentation (see Jon). These documents are used for modeling background.
  • Flavio will complete this document over the next two weeks. Jon should then try to make a sensible user document.
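For readers of the user-facing documentation, the three properties mentioned above (symmetry, reflexivity, transitivity) can be checked mechanically on any set of pairs. The sketch below is only an illustration of the definitions; the function names and example relations are assumptions and do not come from the patterns document.

```python
def is_reflexive(pairs, members):
    """Every member is related to itself."""
    return all((m, m) in pairs for m in members)


def is_symmetric(pairs):
    """Whenever (a, b) is present, (b, a) is present too."""
    return all((b, a) in pairs for a, b in pairs)


def is_transitive(pairs):
    """Whenever (a, b) and (b, c) are present, (a, c) is present too."""
    return all(
        (a, d) in pairs
        for a, b in pairs
        for c, d in pairs
        if b == c
    )


# A parent/child style relation: ordered pairs, none of the three properties.
parent_of = {("A", "B"), ("B", "C")}

# An equivalence style relation over two members: all three properties hold.
same_as = {("x", "x"), ("y", "y"), ("x", "y"), ("y", "x")}
```

Treating the properties as separate checks mirrors the idea of breaking them apart rather than bundling them into a single relation type.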

Will need a similar piece for the process pattern. Flavio may be able to start but Jay will need to fill in the gaps.
Flavio will send along what he currently has. There are some snippets, explanations and examples in the Dagstuhl 2015 presentation.

Add to model review: make sure cardinality supports production process

AGENDA for next week:
Additions to Collections document
Start on Processing document
Issue DMT-58 

 Virtual meeting 2016-02-10

ATTENDEES: Arofan, Wendy, Oliver, Jon, Dan G., Flavio, Larry, Achim, Jay

Flavio Images on Relations

Binary Relation - Introduce the Symmetric Binary Relation and the Antisymmetric Binary Relation

  • Leaving it so the inverse could be implicit in Symmetric Pair
  • What happened to the may or may not be symmetric - Can Antisymmetric deal with this case as it is a sub-set (the default is the ordered pair) asymmetric (You'd have to add a third class if there was an option other than Ordered Pair and Unordered Pair)

DECISION: Generalize Antisymmetric Binary to Asymmetric Binary and the Ordered Pair; Anti-part is specified in the child classes

Binary to N-ary relationships (image with note)

  • Assuming that we can't represent everything with binary - we lose some things in translation so we add the green boxes which are generalizations
  • For example we want one tuple with one source and multiple targets to support a more compact representation of a tree
  • It's a more compact representation so doesn't replace the ordered pairs
  • Could associate the n-ary with the binary specifications
  • This introduced multiple ways of implementing...should we provide guidance to always use the n-ary form
  • Is it a matter of usage or a matter of type of structure.
  • We are designing a pattern so when implemented we should be able to specify how it's to be implemented
  • Needs strong guidance for implementation of this pattern - so where do you use one or the other
  • Ordered Collection Correspondence and Ordered Member Correspondence can realize the abstract or isA
  • Prefer to have isA rather than realizes
  • Do you need to define a collection to define a relationship
  • The relation points to the collection rather than the collection
  • I get a collection of some sort? How do I know I have all its information?
  • I might order a collection that someone else owns?
  • Members and context of a collection. The collection here is just a set of members
  • if we look at a classification and correspondence table it points in the same direction

DECISION: Directionality is correct

MAJOR ANNOUNCEMENT: Arofan CONCEDES THE POINT to Flavio!!

  • This will imply that we want to review what things are members as we review the use of patterns
  • If we need weighted pairs they can just be another attribute in the ordered pair (optional attribute rather than another pair classification)
  • May also want a semantic...is this a function of the abstract or is it a similar thing to weight
  • We have the capability of putting in a controlled vocabulary now so this should handle it
  • Let's implement the pattern and find the needed attributes as we go
  • Are there other major outstanding issues with Collections? no we just need to get it in there.

ACTION: There is general agreement with the change. Can we get it documented and move forward in Lion? Flavio will update in Lion and verify that the classes are used in the right way. Whole thing could take 3 weeks. First couple pages for next week so we can review as he writes.

FUTURE AGENDA ITEMS:

DMT-58

  • Takes value from and measures
  • The representation side we wanted only substantive values
  • The relationship takesValueFrom allows for both substantive and sentinel
  • Fix relationships in Represented Variable to be more specific
 Virtual meeting 2016-01-27

ATTENDEES: Arofan, Wendy, Larry, Jon, Achim, Flavio, Oliver, Johan, Dan G.

AGENDA:

  • NADDI Sprint
  • Pattern Implementation
  • Decide if round-tripping needs to involve the PIM or just the PSM/Bindings; determine what needs to get specified prior to Q2-2016 release - include used defined subset of UML document (Copenhagen) as input to discussion (2016-01 JW) We need to define what we mean by round-tripping Between two specific bindings PIM

Round-Trip Issues:

  • usage provides the definition - from one binding to another without loss of information
  • Are all bindings equivalent?
  • Priority 1 and Priority 2 bindings
    • XML is a Priority 1 binding
    • RDF is a Priority 1 binding
    • Storage and programs are also important - relational databases and programming languages like Java
  • We've had to make modification on how we do modeling because of issues around the binding
  • Depending on what you are binding to you have to make some "approximations" to accommodate the binding
  • All these languages do not have the same expressive power
  • Do we restrict ourselves to the intersection of all these languages to support round-tripping?
  • We need to make sure the binding-specific information of one has a home in another
  • The bindings are not necessarily clean. There are difficulties in mapping between the PIM to the specifics of the binding.
    • There is a degree of collapsing between Abstract to specific classes.
  • If you don't use the full expressive powers of RDF you are OK. There are things in RDF schema that are not in XML.
  • The most important thing is having bindings that people can use to realize DDI in some software
  • So it is the intersection of XML, RDF, RDB, and JAVA-like languages
  • From conceptual model to any binding is a requirement
  • Being able to go from one binding to another binding this can be a different issue
  • Example a "realization" is lost because we deal with it one way in XML and a different way in RDF
  • How big is this intersection as you add more bindings
  • We have this idea that there is supplemental information to the PIM. We know the optimization used to create specific PIMS and therefore should have the information to make those linkages.
  • The round trip requires reversing the PSM to the PIM in order to use the knowledge of the PIM
  • So not from Binding to Binding but from Binding[1] to PSM[1] to PIM to PSM[2] to Binding[2]
  • If I'm going from RDF or XML to Java I need to take flattened characteristics of the classes I need to have access to the PIM to know where the abstract classes go
  • Keep the intersect in mind when doing the PIM to PSM processing
  • Keep in mind the RDB and JAVA-like languages, but if we are retaining the PIM in the translation process this broadens the intersect set
  • We should think about JSON; YAML would be pretty straightforward. Might want to play with model ddi schema in yaml https://drive.google.com/open?id=0B7q7KgKeg0ArbU9QWnFBdG5LcU0

 DECISION

  • We have this idea that there is supplemental information to the PIM. We know the optimization used to create specific PIMS and therefore should have the information to make those linkages.
  • The round trip requires reversing the PSM to the PIM in order to use the knowledge of the PIM
  • So not from Binding to Binding but from Binding[1] to PSM[1] to PIM to PSM[2] to Binding[2]
  • If I'm going from RDF or XML to Java I need to take flattened characteristics of the classes I need to have access to the PIM to know where the abstract classes go
  • Keep the intersect in mind when doing the PIM to PSM processing
  • Keep in mind the RDB and JAVA-like languages, but if we are retaining the PIM in the translation process this broadens the intersect set
  • We need to test this out as soon as we can to make sure this works - Use as a test of Codebook and Functional Bindings (XML and RDF)
  • List of prioritized bindings: XML, RDF, RDB, JSON, JAVA/CSharp, Python
  • Avoid multiple inheritance
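The decision to route round trips through the PIM can be illustrated with a toy example. The sketch assumes a hypothetical PIM with a single-inheritance chain and treats a binding as a flat record; the property names and the dictionary layout are invented for the illustration and do not describe the actual production framework.

```python
# Hypothetical PIM: class -> (parent class, properties declared on that class).
PIM = {
    "Identifiable": (None, ["id"]),
    "AnnotatedIdentifiable": ("Identifiable", ["label"]),
    "Variable": ("AnnotatedIdentifiable", ["name"]),
}


def ancestry(cls):
    """Return the inheritance chain for cls, root class first."""
    chain = []
    while cls is not None:
        chain.append(cls)
        cls = PIM[cls][0]
    return chain[::-1]


def flatten(cls, instance):
    """PIM to PSM: collapse inherited properties into one flat record."""
    flat = {}
    for owner in ancestry(cls):
        flat.update(instance.get(owner, {}))
    return flat


def unflatten(cls, flat):
    """PSM to PIM: use the PIM to put each property back on its owner."""
    return {
        owner: {p: flat[p] for p in PIM[owner][1] if p in flat}
        for owner in ancestry(cls)
    }


instance = {
    "Identifiable": {"id": "v1"},
    "AnnotatedIdentifiable": {"label": "Sex"},
    "Variable": {"name": "SEX"},
}
flat = flatten("Variable", instance)
restored = unflatten("Variable", flat)
```

The flat record alone does not say which abstract class owns which property; `unflatten` can only restore that because it consults the PIM, which is the reason for going Binding[1] to PSM[1] to PIM to PSM[2] to Binding[2] rather than binding to binding.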

NADDI Sprint:

  • Live:  Wendy, Larry, Jon, Arofan, 
  • Virtual (part-time): Dan G., Achim, Oliver, Olof(?)
  • Check with Jared about funding (Wendy)
  • See if Alerk is available? (Arofan)
  • Michel Dumontier (Achim to ask) - to review RDF
  • We need to get a group of people conversant in RDF and need meetings to prepare for production
  • If Olof has got some time offline that is important for this sprint, also Johanna

NEXT WEEK:

  • Get exact agreement on the implementation of patterns
  • Flavio will send proposals for correspondences


 Virtual meeting 2016-01-20

ATTENDEES: Arofan, Wendy, Larry, Jon, Flavio, Dan G.

AGENDA:

  • Continuation of DMT-48
  • Week - modeling issues, methodology, content related to who may be available. NADDI is April 6-8th Wed/Fri, 11-15 Sprint

Continuation of DMT-48

  • Relations are defined on collections and it doesn't seem reasonable to create pseudo-collection
  • We can create correspondences between Members without collections - we might tolerate members without collections
  • We could have an explicit rule that the top member is always a collection
  • If we do that we go back to defining the things that we need to have correspondences for (a specialization of a relation). This creates unnecessary artifacts in the instance.
  • Correspondences in different collections and relations are within a collection - is this a meaningful distinction?
  • We can make just a single solution by making a specialization of relation in correspondence but requires the creation of a collection. This is an intuitive specialization.
  • A correspondence is some kind of relation; right now it is an equivalence relation (MemberCorrespondence, OrderedMemberCorrespondence)
  • The idea of a correspondence table is that some correspondences are not one-to-one (consider merged and divided classifications)
  • Are they transitive or not? We have symmetric (bi-directional) and non-symmetric (ordered relationship)
  • In general you don't have transitivity; are there cases where you do? Is this an argument for using our relations to model this?
  • A correspondence is a relationship (we have 2-ples in the mapping tables) and there is semantic around that that makes it a relation.  These things are built up as relationships as we know them, but the semantic is far from clear.
  • If the underlying definition is mathematical then you need a set. There is a difference between a set of 2-ples. We have the idea of the 2-ples, the idea of the collection of the 2-ples, and the idea of the meaning behind the collection. So a collection of 2-ples could have different meanings. The same set of 2-ples could manifest as 2 different sets of meaning.
  • We wanted the semantics as controlled vocabularies.
  • The set is a necessary construct because we want people to compute it. The semantic is something quite different. Are we talking about a node in a node-set or about an order in an equivalence?
  • We need an agreed vocabulary because we seem to be talking around each other at times
  • Semantic is the meaning/understanding behind the 2-ples that we care about. Partitive, super-type, sub-type, etc. Dan's definition of semantics.
  • We do have that in the model as an external controlled vocabulary. We have examples of these semantic vocabularies.
  • We have a primary semantic - you've realized something and it is "blank". Then you have descriptive semantics such as partitive, transitive, symmetric, etc.
  • Primary semantic is what it is and descriptive semantics specify the structure of the relationship.
  • Set - realized by a primary semantic that tells you what it is
  • Members - realized
  • Primary semantic - a name that describes what something is rather than its properties 
  • (i.e. classification item with specific properties)
  • Member is ...
  • Member-to-Member is ...
  • Specific properties provide the mathematical properties
  • The gotcha here is that correspondences occur in a context so the equivalence is dependent upon what the data is being used for.
  • What level of granularity is needed? A realization of the pattern can be more granular.
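
The distinction drawn above between the 2-ples, the collection of 2-ples, and the meaning behind the collection can be sketched as follows. This is only an illustration in Python, not part of the model: the class name `Correspondence`, the field names, the codes, and the semantic labels ("concordance", "derivedFrom") are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Correspondence:
    """A set of member pairs plus the semantics attached to that set (hypothetical sketch)."""
    pairs: frozenset                                # the set of (source, target) 2-ples
    primary_semantic: str                           # a name for what the set is
    descriptive_semantics: frozenset = frozenset()  # e.g. {"symmetric"}


# Hypothetical codes: A1 is divided into B1 and B2, so the mapping is not one-to-one.
pairs = frozenset({("A1", "B1"), ("A1", "B2"), ("A2", "B3")})

# The same set of pairs carrying two different meanings.
as_concordance = Correspondence(pairs, "concordance", frozenset({"symmetric"}))
as_derivation = Correspondence(pairs, "derivedFrom")


def targets_of(corr, source):
    """Return the sorted targets that a source member maps to."""
    return sorted(t for s, t in corr.pairs if s == source)


print(targets_of(as_concordance, "A1"))             # both targets of the divided code
print(as_concordance.pairs == as_derivation.pairs)  # same set of pairs
print(as_concordance == as_derivation)              # but not the same correspondence
```

Keeping the semantics on the set rather than on each pair is one way to let the same pairs be computed on while their meaning is managed separately, for example through a controlled vocabulary.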

NADDI

  • We have to address reviewing the model. We've made lots of changes to CoreProcess, DataCapture etc. and we need to clean up. We need a consistency review at NADDI: modeling consistency and creating some of the Functional Views of existing portions. How-to documentation must also be done by this time.
  • One thing that will be critical is having the XML implementations of these patterns. Primarily issues of abstract heads of substitution groups. We need to test these in reality to validate the rules in the binding. Rules are often in documentation rather than schema.

 Virtual meeting 2016-01-13

 ATTENDEES: Arofan, Wendy, Flavio, Dan G., Jay, Jon, Larry, Johan

AGENDA:
Update

  • DMT-36 has an attachment which identifies locations for further review and discussion. At minimum rules should be clarified. Flavio and Wendy are working on this.
  • DMT-33, 35, 38, 14, 8, 13 are DONE
  • DMT-36, 34, 11, 2, 23, 29, 31 are IN PROCESS

Review Relations and Correspondences to see if there is anything we need to change SEE DMT-48

  • What we are finding is that what we had made sense prior to the updates in the use of patterns over the past 6 months
  • With the generalization of relations we need to look at correspondences within a collection
  • Do we want correspondences between objects without a collection?
  • Is there a rationale for why correspondences are different from relations? When correspondences were added we didn't have relations
  • Don't want two different ways of doing the same thing
  • Can we implement correspondence tables?
  • When we talk about a correspondence table in terms of a classification scheme we mean something very specific. If we generalize it we need to look at this and see if we can implement it using relationships
  • Do we have the power to realize a relationship as a specific usage/application?
  • Which way do we do this? How do we realize the relationship?
  • We need the requirements so that we can organize this
  • From a binding point of view, we prefer to keep specific things out and only the common things in
  • Use the correspondence table as a use case for "realizes" mechanism
  • Relations is a pattern and is used by a "realizes" mechanism
  • This might mean growing our concept of relations in ways we may not know of now - for example, how you'd do this with certain data storage patterns that are not binary in nature (n-ary relations), making a more complete conceptualization of relations. There are situations where you can't reduce the n-ary to binary; however, there are situations where you want to create an n-to-1 relationship (e.g. 5 single-year representations to a single coded cohort). We wanted to reflect hyper-graphs used by some knowledge models. We've basically just been dealing with graph data.
  • Sounds like we're engineering to requirements that we don't know we have yet. We just don't want to model ourselves into a corner. We need to explore this and make sure the model is extensible in that direction.
  • In a binary relationship I can add the A=B. If I have a multiple relationship how do you type the specific parts of the model? Do we know enough to be able to type relations in a hyper-graph situation? We want to keep these separate to prevent misuse of n-ary relations.
  • The ORM modeling paradigm states that you can decompose n-ary relations into a set of binary relations. In other cases more complex relationships are required (i.e. a situation where the overall relation cannot be described as the sum of the binary relations - students in a classroom).
  • n-ary relations cannot be reduced in a hyper-graph; they can be in graphs.
  • What we have is solid now and we don't seem to preclude expanding into n-ary relations.
  • Am I limited to a data structure that realizes a binary relation? At the moment yes.
  • However, a data set is a collection not just a relation.
  • A data structure is a collection of data records which are a collection of data points. There is also the structure of a data record (organization of points to create different types of data record).
  • We don't have JUST the binary relation, we have collection which is a collection of relations. How far can we go with this structure besides just binary (as we have collections).
  • What we have seems good but we need an abstract "relation" where Binary is a type of relation. If you look at collection now it says it's structured by a binary relation. If you have a collection which is structured by a relation then we are future proof.
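
A rough sketch of the structure discussed above, with an abstract relation, a binary specialization, and a collection structured by "a relation" rather than by a binary relation specifically. The Python class names, the arity check, and the year/cohort values are illustrative assumptions, not the model's definitions.

```python
from abc import ABC, abstractmethod


class Relation(ABC):
    """Abstract relation: a set of tuples over members (hypothetical sketch)."""

    def __init__(self, tuples):
        self.tuples = {tuple(t) for t in tuples}
        for t in self.tuples:
            self.check_arity(t)

    @abstractmethod
    def check_arity(self, t):
        """Raise ValueError if the tuple has the wrong number of members."""


class BinaryRelation(Relation):
    def check_arity(self, t):
        if len(t) != 2:
            raise ValueError(f"a binary relation needs pairs, got {t!r}")


class NaryRelation(Relation):
    def __init__(self, arity, tuples):
        self.arity = arity
        super().__init__(tuples)

    def check_arity(self, t):
        if len(t) != self.arity:
            raise ValueError(f"expected {self.arity} members, got {t!r}")


class Collection:
    """A collection structured by any Relation, not only a binary one."""

    def __init__(self, members, structured_by):
        self.members = set(members)
        self.structured_by = structured_by
        for t in structured_by.tuples:
            if not set(t) <= self.members:
                raise ValueError(f"{t!r} refers to non-members")


# Binary case: an order between two items.
ordered = Collection({"a", "b"}, BinaryRelation([("a", "b")]))

# n-to-1 case from the discussion: five single years mapped to one coded cohort.
years = (2001, 2002, 2003, 2004, 2005)
cohort = Collection(set(years) | {"cohort_1"}, NaryRelation(6, [years + ("cohort_1",)]))
```

Under this sketch nothing in `Collection` depends on the relation being binary, which is the sense in which the structure stays open to n-ary relations later.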

Added DMT-48 to track the following work:

  • Add abstract Relation class and make BinaryRelation a type of Relation
  • Change documentation of collection to reflect this more generic structure.
  • Review where these can be used (Binary can only be used by Members) -- what is a Member? - Wendy
  • Ease of use of specifying a set of binary relations - right now we have a convenience thing that says everything in the set is equivalent (tree)
  • Collections of nodes and leaves in a tree (we support ragged hierarchies)
  • We may get into trouble with that in classification structures as the levels have relationships that may not be binary. A level is a set of items. That's why it can't be just a binary relation (relations between levels, relations between items may be different hierarchies)
  • Some of this is already taken care of in the Neuchâtel model (classification family, etc.). Look at this to see how it all ties together
  • Need a good example to work with.
  • So we need to bring things up to date with work from the last 6 months - see the Classification Package: how would it realize the Collection Pattern?


 Virtual meeting 2016-01-06

ATTENDEES: Wendy, Larry, Oliver, Flavio

Agenda:

Review list of tasks for priorities, note any additional issues

  • Need a means of identifying which objects are exported for a specific view instance  
    • How to map representations between database representations  
    • If there are no nesting relationships then you need some form of foreign key table in a view  
    • We need to cover all the associations not only between types but between specific instances of classes to each other  
    • We need to have this addressed at the point we put out a really functional view such as a codebook, a data set, a questionnaire, etc.  
    • Write this up more clearly with examples (wlt) 
  • Using named triples - how do you say that a triple you've put in is no longer relevant?  
    • Once named you can write triples about the triple - allows the triple to be an object itself (they are quads where each triple has an identifier)
  • How do you say if an association is valid (rather than a valid object) - how do you do this with RDF triples  
    • We use flavors of this RDF approach with UML  
  • RDF usage - Thomas and Brigitte examples - IQ double issue on DDI <http://iassistdata.org/iq/issue/38/4>  
    • Franck Cotton would be good to involve as well as Guillaume  
    • Get a definition of a subset of the world RDF vocabularies so we can get input from our RDF contacts from Dagstuhl  
  • Modeling relations such as applying weights  
    • Possibly remodeling correspondences as relations  
    • What are members and so can be treated as objects for correspondences/relations  
    • Conceptual package for example  
    • Questions could be acts/process steps (are they extensions or realizations of acts)  
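
The point above about named triples (a triple with an identifier can itself be the subject of another triple) can be illustrated without any RDF library. The sketch below uses plain Python tuples; the `ex:` predicates, the statement identifiers, and the "deprecated" status value are hypothetical and are not drawn from any DDI or RDF vocabulary.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """A triple plus an identifier, so the statement itself can be referred to."""
    id: str
    subject: str
    predicate: str
    obj: str


store = [
    Quad("s1", "var:age", "ex:measures", "concept:Age"),
    Quad("s2", "var:sex", "ex:measures", "concept:Sex"),
    # A statement about statement s1: it is no longer relevant.
    Quad("s3", "s1", "ex:status", "deprecated"),
]


def is_current(store, statement_id):
    """True unless some statement marks the given statement as deprecated."""
    return not any(
        q.subject == statement_id and q.predicate == "ex:status" and q.obj == "deprecated"
        for q in store
    )


def current_statements(store):
    """Return the ordinary statements that have not been marked deprecated."""
    return [q for q in store if q.predicate != "ex:status" and is_current(store, q.id)]


print([q.id for q in current_statements(store)])
```

The same mechanism could carry a validity statement about an association rather than about an object, which is the question raised in the list above.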

For next week:

  • Review Relations and Correspondences to see if there is anything we need to change - contact Jay to see if he is available  
  • Look at how "instantiates" is modeled  
  • When do issues go into JIRA? There is overhead but do they get lost in email? In terms of modeling tool
  • Look at possible cloud repository of Enterprise Architecture (infrastructure needs, cost of seats) what about other tools (see Copenhagen Sprint)  

Oliver not available the next two weeks.

 Virtual meeting 2015-12-16

ATTENDEES: Wendy, Achim, Larry, Flavio, Oliver, Dan G., Dan S.

  • Created a list of activities for 2016 with assignees of primary person and due dates (posted on Modeler's main page)
  • Reviewed issues from Copenhagen, setting priorities and assigning to individuals
  • Next meeting 2016-01-06

 Virtual meeting 2015-12-09

ATTENDEES: Wendy, Larry, Dan G., Johan, Flavio

Discussion of:

  • Sprint work
  • New issues in DMT list
  • Issues about Views and how they are represented
  • Issues about an earlier publication of Codebook (not necessarily complete) to test views and transference of information from existing DDI-C instances

ACTION: File issue for Lion - Although "include in build" is checked on the Edit page it does not appear on the general view for a Functional View  (wlt)

 Virtual meeting 2015-11-11

ATTENDEES: Wendy, Larry, Oliver, Flavio

Moved the changes from Dagstuhl into Drupal  

  • Need to review the model as a totality but need Arofan and Jay  
  • Need to review Data Description (DD) - check with them if it's ready  
  • Larry says DD will review next week then send to DD  
  • Modeling Team should look at how Discovery relates to everything else (basically how the whole thing holds together)  

Do we need a review process? What is being looked at? Use Case application?

 Virtual meeting 2015-11-04

ATTENDEES: Wendy, Larry, Flavio, Oliver, Dan G.

Reviewed issues in DMT and assigned priorities. Issues 13 and 14 were determined as done due to actions of other groups.

 Virtual meeting 2015-09-23

ATTENDEES: Wendy, Dan G., Oliver, Larry

Discussed what needs to be dealt with at Dagstuhl Sprint.

  • Need to push Data Description towards concrete goals for the sprint like the ability to describe a CSV file
  • What is available for the Modeling Team to review from Data Description

Issues:

  • Identifier - Reference: There is a definition but no content, Larry created a DDIIdentifier. There is some discussion whether this needs to be modeled in order to specify relations that are weaker than model relations (should this be a property - complex data type)
  • Collection pattern
  • Process model pattern
  • Citation information - how does this work in a view and what other information is needed at the view level
  • Annotation - should this be identifiable? i.e. should it be a property rather than a relation
  • DMT issue list

Presentations:

  • Architecture - suggested that Architecture and Modeling be merged and have Flavio and Jay put together presentation with Flavio presenting
  • Modeling - see above
  • Production process - Marcel and Arofan, should Marcel present?




 Virtual meeting 2015-09-16

ATTENDEES: Arofan, Wendy, Larry, Johan, Dan G.

Qualitative review  of model (see .pdf files on Qualitative page)

  • Where is this? It covers what we had in earlier discussions. Lets you have say a marked up video and find segments and correspond to transcripts. We can attach a lot of metadata to them through annotations. The part that was not consistent with 4 is how we define segments but are able to use custom metadata for this. Can attach other kinds of things using custom key/value pairs.
  • It's complex but the underlying concept is complex. Analytic looks good.
  • Physical aligns with HTML but we may want to revisit just because HTML has been updated. Could SegmentByXML also be defined by an XQuery? Typo in SegmentBySqlQuery (missing "S" in Sql)
  • Like the design that replaces the 3.2 "Segment" approach
  • Could we test this with use cases defined earlier (past few years)
  • Johan could provide some rock carvings for a use case
  • Get some examples from Cornelia at GESIS
  • Traditional interview use case (video, original transcript, derived transcript)
  • Need to cite a segment
  • Pull together documentation on the use case

Modeling Team needs to set up a group of use cases and get back to the qualitative group to review and test.

Evening session at Dagstuhl to review with outside groups and people who haven't previously looked at the

General

  • We need to organize what the Modeling Team needs to work on
  • Priority is collection and order relationship
     
 Virtual meetings 2015-09-02

ATTENDEES: Arofan, Wendy, Flavio, Johan, Larry, Oliver

Order Relationship:

  • For people who can relate to the attributes it's not difficult to use
  • We need more domain specific subclasses of relationships that make more sense
  • There is a value in having the subclasses but they need examples in the documentation
  • We think it's a good compromise to support a single idea with lots of related terminology
  • Other possible relation patterns include directed graphs; these will be added to the spreadsheet
  • Larry will update spreadsheet

Make better use of JIRA DMT project

  • Versioning discussion and documents from Mpls (PIM and PDM) need to be entered in JIRA
  • TC will add to DMT issues as they complete the review of Q1 issues (work on process to make this more timely and to avoid work clashes)
  • Look at issues from London Sprint that may result in issues for DMT
  • DMT-8 Flavio will check on documentation to see if it resolves this issue

Qualitative (Larry sent email and posted proposed model to Qualitative Team page)

  • Larry will update Drupal with a new Qualitative package

 Virtual meetings 2015-08-26

ATTENDEES: Wendy, Flavio, Larry, Oliver

Regrets: Arofan, Jay

Reviewed relation types on the spreadsheet, adding some additional subclasses so that we could cover Allen's Interval Algebra and RCC8 (geospatial relationships). These relationship descriptions will be useful parts of the descriptions as they are expressions that many users are familiar with. The work will be continued based on use cases that arise. For now we can go forward with the example types we have identified once we complete the grid contents for each.
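
As a reminder of what covering Allen's Interval Algebra involves, the sketch below classifies a pair of one-dimensional intervals into one of the 13 base relations. It is a plain Python illustration; the function name and the relation labels are chosen here for readability and are not names from the spreadsheet or the model. Intervals are assumed to be (start, end) pairs with start < end.

```python
def allen_relation(a, b):
    """Return the Allen base relation of interval a to interval b.

    Each interval is a (start, end) pair with start < end.
    """
    (a1, a2), (b1, b2) = a, b
    if a2 < b1:
        return "before"
    if b2 < a1:
        return "after"
    if a2 == b1:
        return "meets"
    if b2 == a1:
        return "met-by"
    if a1 == b1 and a2 == b2:
        return "equals"
    if a1 == b1:
        return "starts" if a2 < b2 else "started-by"
    if a2 == b2:
        return "finishes" if a1 > b1 else "finished-by"
    if b1 < a1 and a2 < b2:
        return "during"
    if a1 < b1 and b2 < a2:
        return "contains"
    # Only proper overlaps remain at this point.
    return "overlaps" if a1 < b1 else "overlapped-by"


# Hypothetical segments of a recording, in seconds.
print(allen_relation((0, 10), (10, 20)))
print(allen_relation((0, 10), (5, 15)))
print(allen_relation((5, 8), (0, 10)))
```

The function only works because the intervals lie on a single axis, so an equivalent for regions in space would need RCC8 or a similar calculus instead.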


 Virtual meeting 2015-08-19

ATTENDEES: Wendy, Larry, Oliver, Dan G.
Regrets: Arofan, Jay, Flavio

Agenda: Work examples from last week in detail. Larry and Jay

DDI4 and Qualitative Data.pptx

  • Use of relations among files for a Qualitative data set
  • Novel with alternate endings
  • Describing segments - we could use a set of key/value pairs for those we don't have
  • Seems to be able to capture what's needed
  • If you need to describe segments, then Allen's Algebra works well
  • Seems pretty straight forward, but not quite clear if existing features in the model can handle the use case
  • In terms of segment relations beyond linear relationships: for example, what does "precedes" mean in overlapping 3-D space unless there is a direction? You need a one-dimensional context in order to use precedes and succeeds (follows).
  • It is the semantics that define the directionality rather than the properties we have identified (each may or may not be directional)
  • In terms of the relation pairs we need to be able to describe non-directional pairs.

Process Model est _1 v2.pptx

  • Confusing layout in first slide
  • Why is Question a process step? Asking the question is the process step.
  • The diagram is confusing in what it is trying to relate - it's not immediately clear
  • Next time Jay is on go over his power point.

Documentation issue to hand to Jon:

  • We're trying to talk about process and then capture it in a data model (just the parts)
  • Use cases need to plug the parts into a process
  • We need a means of expressing Use Cases, Processes, etc. This should be a discussion point for Dagstuhl (pass to Jon's list). For example the UML approach or other consistent means of expressing these things in the documentation.

 ACTIONS:

  • Wendy will inform Jon about discussion regarding Use Cases, etc.
  • Wendy will send out spread sheet of possible relation sub-types for members to populate

 Virtual meeting 2015-08-12

ATTENDEES: Flavio, Wendy, Dan G., Oliver, Johan, Larry, Arofan

  • The equivalent and order relationship was OK but how to represent
  • Current stated options: Controlled vocabulary vs. Strict subtypes
  • Suggested 3rd option: Use subtypes where we know what they are then add one that is managed by a controlled vocabulary
  • Is the model Relations to Controlled Vocabulary - we are discussing only the ones at the bottom (EquivilenceSemantic, OrderSemantic, StrictOrderSemantic)
  • These are properties that describe semantics in other contexts (Equals_To, Same_As, Similar_To, etc.)
  • There are two different types of controlled vocabularies (upper right and bottom)
  • Also need non-transitive types - we identified 3 but not all (not exhaustive)
  • Concerned that we don't think that people outside of core DDI community have the ability to make new DDI classes.
  • The main question is which approach is  best - we need to balance control with flexibility and simplicity with strictness
  • The user needs to understand that a StrictOrder
  • The use of subtypes adds an extra layer of abstraction. They should be working with existing order relations and node hierarchies. If it is insufficient they should ask for extensions of DDI.
  • CV cleaner/simpler to maintain and can be extended by DDI but not by user (they would have to define the relation)
  • One person uses equal_to other uses same_as so while they are all equivalence relationships they are thought of in different terms. The semantics are more about how people think
  • The ease of use for final users should not see much of a difference in the two approaches
  • Relations with specializations v1.1 JPG and Classification and Order Relations with controlled Vocabularies v1.1.
  • What is the impact on the existing structure? Not much; most are using the center order relation (renaming NodeParentChild and dealing with the pairs). It's in the collection classes but in the classes that use realization. We don't have the special type of association. We need to agree on annotation (dotted lines need to be added to Drupal, so we need the line and a label for it).
  • This may have some impact on how we implement the XSD. There is still the issue of dealing with multiple inheritance for a number of classes. How to deal with those patterns to make some sort of content-wise single inheritance and implementing patterns.
  • Realize association rather than inheritance. It needs to be implemented but it's not such a big deal in terms of the implementations. Nothing here has changed really
  • The way we are referring to this is not implements but realizes. In Drupal do we just create a relationship and name it realize? Yes. Is it possible to have several patterns implemented by the same class? I don't think we have any case of this and should be aware of the issue if we do create a class that realizes several classes due to the problem of name-clashes.
  • Inheritance and Realization (use US English for spelling)
  • We need the examples (from Jay and Larry)

Decision:

  • We are going to follow Relations with Controlled Vocabularies JPG with 3 types of relations currently identified. We can discuss later if we want specific Partition
  • If I want to describe a graph of people publishing together, I have to use Relation and then need a semantic. Otherwise they have to do this with a formal relationship. DDI needs to be completely specified. Can we add this with an internal controlled vocabulary?
  • Dan is talking about having a semantics attribute that points to a specific external controlled vocabulary published by the DDI so that it can be updated on a separate publication cycle.
  • We will be able to enumerate the number of subtypes (there is a finite number). But there is a semantic behind that set of properties behind each type and these are not exhaustive.
  • The semantics themselves need to be controlled.
  • Enumerate all the possible types based on the specified properties.
  • There is a semantic associated with each subtype which should be a class. The objects
  • Model as presented using the simple class for the semantic types. Review the location of semantics (internal/external) and make an issue of review.


 Virtual meeting 2015-08-05
ATTENDEES:  Arofan, Wendy, Dan G., Oliver, Jay, Johan, Larry, Flavio

Agenda

  • controlled vocabulary - still pending
  • collections

Jay modeled a sampling methodology based on Dan G. Sampling Plan Design

  • The relationship subtypes are not exhaustive so Jay made up a new order relation. Used SampleOverview, State Sample, Stage. Then added a state relation pair with stage relation pairs. SampleOverview is a Design: it inherits from the Design class in Methodology and is instantiated with Sampling Plan. See the Sampling Instance PowerPoint
  • Question regarding the relation on slide 4: not sure how "followed by" semantically captures the relationship between the two StageSamples (Physician Practice and Hospital)
  • Question: When you are talking about Wednesday following Monday and Friday following Wednesday only implies they are samples
  • Question: You need a set of things to build a relation. Is there a set here?
  • I thought there was a set of stages and set of stage samples. At one level you have a set of stages in a multi-stage sample and some order relation between those stages. There is also a set of state Samples that are ordered in particular way.
  • So you are ordering stage samples not doctors or patients.
  • When you look at the relationship between doctors and patients and hospitals.
  • At any stage you might divide the frame into multiple samples (Doctor Practices, Hospital)
  • This is a two-stage sample which requires two separate samplings (Doctor Practices and Hospital). These need to be sequenced back up.
  • Dependency chain - Need to identify the Physician Practice and Hospitals in order to select patients.
  • Think we could do something much more complicated and real with this approach as it does seem in line with what Arofan was saying (i.e. relationship between Hospital and New Patient Admissions being a "contained" relationship)
  • Are you claiming you can do more than Dan's original model? Oh no!
  • "Follows" didn't capture the additional semantic of the days of patient sample
  • The machinery to describe the staging as these are perfectly sequential by nature. So in the spec as originally written has a stage number which is all you need. There are dependencies among the samples up and down the stage ladder
  • At the stage sample level there are additional semantic and there is some agreement that this is needed. Thinking there were only relationships between stages, but there is a need for relationship between the StageSamples with the Stage.
  • This may have not been the right use case to test this on but it was useful in looking at the use of the collections pattern as a means of expressing this. It provides a way of testing these things that seem to conform to collections pattern.
  • How confounding would this be to the Business modeler? Is it as accessible as we need it to be?
  • I think that without some domain notion of what we are talking about this is hard to understand. But it should be accompanied by documentation and examples. Need to be clear what the set is here (it's the sample rather than the contents). Using the collection pattern is not necessary, but it is provided as a way to generalize so that we can reuse algorithms etc.
  • For example a clinical trial is not a sample set; people are included by event. So we need to understand inference. There is real world stuff going on that may violate some of these pattern assumptions. Do we want to take this tack with the model Dan put together?
  • We are not in a position to make decisions yet.
  • One thing I (Jay) want to add. The semantics in Dan's original proposal was not sufficient. I only discovered this by using the collection pattern which enabled me to drill down and think through relations between stages and samples in a granular way. To me this is the power and expressiveness of collections pattern. The question (and challenge) I have after our discussion on Wednesday is what can Dan do to make his model more expressive short of using the collections pattern?

Flavio work on Classification and order relations

  • There is a separation of where we talk about a set of pairs so we don't have to repeat it on every pair. In some circumstances a particular pair may have different semantics, so do we want to reuse pairs? No. There should be a black diamond (composition) between the relation and the pair. If the relation is listed as transitive you would not have to list every pair, which is the case in this example. The semantic is DescendantOf.
  • When you think about the number of things you'd have if you didn't have the separation you'd have to describe a huge number of pairs.
  • Question about the term parent-child in the sense of this being transitive. At the specific pair level it's true. Conflicts with transitive. We have to be really careful of the semantics we state so it may be better to change the terminology of parent-child (doesn't imply transitivity).
  • This is an expedient that can cause confusion. In the custom model we allowed for expressing the semantic of the relationship.
  • Could call it NodeHierarchy and NodeHierarchyPair and then define the semantic. The hierarchy is structural and this would allow us to separate the words (semantics) from the structure.
  • When we are implementing patterns we are trying to provide something real and known to the processor
  • It's very possible that you are accounting for everything you know but you may not know everything.
  • We have to be careful because for any single set of pairs we need to have the parent-child relationships; otherwise we are not guaranteed to get a tree.
  • There is a good way to think about this. You don't have to worry about calling this parent/child because in a hierarchy you think about this as a tree with adjacent nodes. When we implement a pattern in our own model for a specific purpose we need to declare the relationship clearly. We may have cases where we need looseness but let's keep these separate.
  • Node hierarchy representing relation and then contains a node hierarchy pair with contains node parent-child pair.
  • Need to allow for inference and explicitly stated relationships. Would the node hierarchy pair be abstract? No, but do we want to distinguish between the two (hierarchy and parent-child)? What of this set of objects is actually abstract (pink, others are in the model)? Why wouldn't the green objects be abstract with nodes acting as types? NodeSet and Node have to be consistent. In defining the set you must specify that something is complete.
  • See slide 3 for relationships with specification using subclasses. This can be used (see slide 6) creating a DescendantOf subtype. Dan's controlled vocabulary would plug into the subtype.
  • Having to change the model is an advantage which forces us to examine each time. This could slow down a particular implementation unless we support the unknown type.
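
The two concerns in this discussion, that a set of parent-child pairs is not automatically a tree and that a transitive semantic means descendants need not be listed pair by pair, can be sketched as below. The function names and the classification codes are hypothetical; this is a Python illustration, not the NodeHierarchy definition.

```python
def parent_map(pairs):
    """Map each child to its parent; raise ValueError if a child has two parents."""
    parents = {}
    for parent, child in pairs:
        if child in parents and parents[child] != parent:
            raise ValueError(f"{child!r} has two parents")
        parents[child] = parent
    return parents


def is_tree(pairs):
    """True if the (parent, child) pairs form a single rooted tree."""
    try:
        parents = parent_map(pairs)
    except ValueError:
        return False
    nodes = set(parents) | set(parents.values())
    roots = nodes - set(parents)
    if len(roots) != 1:
        return False
    # Walking up from any node must reach the root without revisiting a node.
    for node in nodes:
        seen = set()
        while node in parents:
            if node in seen:
                return False
            seen.add(node)
            node = parents[node]
    return True


def is_descendant_of(pairs, node, ancestor):
    """Derive descendant-of from the stated pairs; assumes is_tree(pairs) holds."""
    parents = parent_map(pairs)
    while node in parents:
        node = parents[node]
        if node == ancestor:
            return True
    return False


# Hypothetical classification codes, stated only as adjacent parent-child pairs.
pairs = [("1", "1.1"), ("1", "1.2"), ("1.1", "1.1.1")]
print(is_tree(pairs))
print(is_descendant_of(pairs, "1.1.1", "1"))   # never stated as a pair, but derivable
print(is_tree(pairs + [("1.2", "1.1.1")]))     # a second parent breaks the tree
```

Only the adjacent pairs are stored here; the transitive reading is computed when it is needed.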

Larry - Order Relations Qualitative Files

  • Could use relation pairs to describe chapters (ordered),
  • Could describe alternate endings into sequence as equivalent pairs
  • Another example would be equivalents of video or audio with text.
  • Good example but needs to be explored more to cover transitive levels and non-transitive

NOTE: Oliver will send something on schema creation by the end of the week.

Agenda items for next week:

  • Back to this in a more abstract form rather than specific examples
  • Qualitative example
  • Schema Creation
 
 Virtual Meeting 2015-07-29

Modeling Meeting Minutes 2015-07-29

Looking at Flavio’s model for relations (see below)

Four properties
Totality
Reflexivity
Symmetry
Transitivity
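
For reference, the four properties can be checked mechanically on a relation given as a set of pairs over a set of members. The Python sketch below is illustrative only. The function names are hypothetical, and totality is taken here to mean that every two distinct members are comparable, which is an assumption about the intended definition.

```python
def is_reflexive(members, pairs):
    return all((m, m) in pairs for m in members)


def is_symmetric(pairs):
    return all((b, a) in pairs for a, b in pairs)


def is_transitive(pairs):
    return all((a, d) in pairs for a, b in pairs for c, d in pairs if b == c)


def is_total(members, pairs):
    # Assumed meaning: every two distinct members are related one way or the other.
    return all(
        (a, b) in pairs or (b, a) in pairs
        for a in members for b in members if a != b
    )


members = {1, 2, 3}
less_than = {(1, 2), (2, 3), (1, 3)}
immediately_before = {(1, 2), (2, 3)}

print(is_transitive(less_than), is_total(members, less_than))
print(is_transitive(immediately_before), is_total(members, immediately_before))
```

A relation such as "immediately before" fails the transitivity check because the pair (1, 3) is absent.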

Neither is needed because some apply to all members of the set. Do we need a fourth option of “unknown” vs all attributes optional? Unknown would be unassigned.

We know some subclasses  (strictOrderRelation)

Neither might include case where some elements are transitive and others are not.

Apply an inference engine to DDI information. Example: if a<b and b<c and the relation is described as transitive, then we can infer a<c. If transitive is “neither”, then only the ground facts are known (a<b and b<c).
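
A minimal sketch of the inference described here, assuming the relation is given as a set of ground pairs together with a flag saying whether it is declared transitive. The function name `infer` is hypothetical and the loop is a naive closure, not a proposal for an actual inference engine.

```python
def infer(ground_facts, transitive):
    """Return the facts that follow from the ground facts.

    If the relation is declared transitive, compute the transitive closure;
    otherwise only the ground facts are known.
    """
    facts = set(ground_facts)
    if not transitive:
        return facts
    changed = True
    while changed:
        changed = False
        for a, b in list(facts):
            for c, d in list(facts):
                if b == c and (a, d) not in facts:
                    facts.add((a, d))
                    changed = True
    return facts


ground = {("a", "b"), ("b", "c")}
print(sorted(infer(ground, transitive=True)))
print(sorted(infer(ground, transitive=False)))
```
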

How usable is this model?

Should we have attributes required but default to unknown?

Some relations that are not transitive: Literal use of “parent-child” , “immediately before” , “instance of” all not transitive.

A confusion point is - order between pairs vs order across whole set.

Two cases
1)    a hierarchy where all members are the same
2)    pairs differ parent-child,  instance of

Do we want to support an inference engine?  This is very complicated, particularly when looking at Allen’s interval algebra, which has 13 base relations.

With RDF we will want to support inference. A goal would be to make it easy for users but allow for more well defined relations.

Does putting this into XML and RDF make for complexity?

Where is precedes?  Can we possibly enumerate all the possible semantics?
Maintain a controlled vocabulary?

Types vs semantics. The various semantics may fall into a smaller set of types.


Action:  take this model and apply it to some cases.
Relations among files (Larry), Other collections in Drupal. Classification (Flavio) Process model (Jay)

Can we use “realizes”?

Flavio's model as discussed above:

 Virtual meeting 2015-07-22

ATTENDEES: Arofan, Achim, Flavio, Johan, Larry, Wendy, Jay, Dan G.

Site update - reorganization has been done but need to review items in the JIRA list from London and document locations

Ordering:

How do we go about structuring the ordering options?

They fall into the use of subtypes, attributes, or a combination of the two

Attributes:

  • Meaning of the ordering
  • Set the ordering is applied to
  • Indicator if ordering is reflexive or strict
  • Indicator if the ordering is total (linear) or partial (some pairs of elements are not comparable)

If you are declaring an ordering it is transitive

Currently this information is related in different ways in the model

Do we want to change what is there? add, remove? ( emails from Flavio and Jay)

We want consistency

  • OrderRelations also have 4 dimensions as properties.
  • isPartial (yes, no, unspecified)
  • isTransitive (yes, no, neither?, unspecified)
  • IsSymmetric (yes, no, neither?, unspecified)
  • IsReflexive (yes, no, neither?, unspecified)

Five options were listed in minutes of 2015-07-08

Alternatives:

(1)    We have a single OrderRelation class in the pattern, and rely on its definition when realized to determine whether it is partial, transitive, reflexive, etc. The review process would guarantee that definitions are rigorous.

(2)    We expand OrderRelation in the pattern with a set of useful sub-classes, which specify transitivity, reflexivity, etc., and these are the classes in the pattern which are realized instead of OrderRelation.

(3)    A variation of 2: We have the OrderRelation class, as well as completely defined useful subclasses, but we allow any of them to be realization points in the pattern. People who don’t understand the subclasses would just realize OrderRelation directly, and the review process would correct any misuse.

(4)    We have a class OrderRelation with a set of attributes which specify whole/part, transitivity, reflexivity, etc.

(5)    Variation of 4: We have these attributes on OrderRelation, but we do not require that they be specified.

Discussion:

  • We want to be formally complete in how we model things.
  • We have the collections document which we should review and have it made part of the business modelers guidelines 
  • Only thing we are suggesting relaxing is reflexivity
  • All other relationships are Symmetric and Transitive
  • Parent/child is defined by ancestor/descendant rather than by literal parent/child
  • Your tree is built of a set of transitive relationships
  • Scissors, paper, stone are simply relations NOT order relations
  • Do we need to separate simple "relations" from ordering?
  • Demonstrated that a mix of pairs in an instance is not an order relationship so Jay made a relation as well as an order relationship.
  • What is in the model now is an abuse of order relations to express relations
  • Most of the ones we have in the model are order relations
  • We need order relations explicitly in the model. We also need a more generic relation with no definition of attributes, but with information on equivalency.

We need a relation object (assign attributes as optional things according to discussion results)

  • Reflexivity
  • Symmetry
  • Transitivity
  • Total/Partial

Order relations are a subclass that is antisymmetric and transitive (others optional)


Equivalence relation:

Symmetric, reflexive and transitive (can be total or partial - is this important, trivial?)

Some attributes are constant in a given class
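
A minimal sketch of how the four attributes could be checked against an actual set of member pairs, and how the result maps onto the Order / Equivalence / generic Relation split discussed here. The function names and the sample relations are illustrative assumptions, not part of the model.

```python
from itertools import product


def relation_properties(members, pairs):
    """Report which properties hold for a binary relation given as (source, target) pairs."""
    members = set(members)
    pairs = set(pairs)
    return {
        "reflexive": all((a, a) in pairs for a in members),
        "symmetric": all((b, a) in pairs for (a, b) in pairs),
        "antisymmetric": all(a == b or (b, a) not in pairs for (a, b) in pairs),
        "transitive": all(
            (a, d) in pairs
            for (a, b), (c, d) in product(pairs, repeat=2)
            if b == c
        ),
        # Total: every two distinct members are comparable in some direction.
        "total": all(
            a == b or (a, b) in pairs or (b, a) in pairs
            for a, b in product(members, repeat=2)
        ),
    }


def classify(members, pairs):
    """Map the properties onto the three kinds of relation named in the minutes."""
    p = relation_properties(members, pairs)
    if p["symmetric"] and p["reflexive"] and p["transitive"]:
        return "Equivalence"
    if p["antisymmetric"] and p["transitive"]:
        return "Order"
    return "Relation"


# Hypothetical sample relations on small sets.
less_than = {(1, 2), (2, 3), (1, 3)}
same_parity = {(1, 1), (2, 2), (3, 3), (1, 3), (3, 1)}
beats = {("scissors", "paper"), ("paper", "stone"), ("stone", "scissors")}

print(classify({1, 2, 3}, less_than))
print(classify({1, 2, 3}, same_parity))
print(classify({"scissors", "paper", "stone"}, beats))
```

In this sketch the attributes are derived from the data; in the model they would instead be declared (and fixed by the subclass), so a check like this could serve as a validation step rather than a definition.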

Wrapping up:

  • Need to review the collections document
  • Are we missing a directed relationship? Ordering creates a directed class
  • But what about a directed simple relation? Symmetry implies have your cases (no direction). The semantic implies the direction. Another type of relation may have a direction.
  • Subclasses would fix the values
    • Order
    • Equivalence
    • Relation (non-equivalent and non-order, directed cyclic graphs, etc.)
  • On relation itself, would be a binary with a source and a target (works well for directional) describe use for non-directional as in 3.2 (either can be source or target)
  • Set the attributes as required
  • Keep terse and clean and deal with understanding through documentation
  • Go back to collections document, extract what is useful, and create a document which can be incorporated into modeler's guidelines and high level documentation for model
  • Relation should be a pattern that is realized in the model. I could realize a relation class that is neither order nor equivalence. Propose a controlled vocabulary of relationships we have defined "part of", "linear order", etc. There is a GSIM list we can work from. (ACTION: Dan)
  • How do we manage differences within different classes? Let's get the list and then figure this out.
  • ACTION: Create a model in a scratch document (Flavio) 
 Virtual meeting 2015-07-15


Notes from Modeler’s meeting 2015-07-15

Attending: Arofan, Larry, Dan G., Wendy, Flavio


OrderRelations have semantic types:

Currently: Sequence, Parent-child, Part-whole, Relationship

These semantic types are the basis for extensions of the abstract OrderRelation class.


OrderRelations also have 4 dimensions as properties.

isPartial (yes, no, unspecified)

isTransitive (yes, no, neither?, unspecified)

IsSymmetric (yes, no, neither?, unspecified)

IsReflexive (yes, no, neither?, unspecified)

There was some question as to whether the neither option is needed. We need to be clear about what these options mean. Would (always, never, neither) be clearer?

For some order relations no transitive relationship is true:

Scissor>Paper  

Paper>Stone

Stone>Scissor

Scissor>Paper and Paper>Stone but not Scissor>Stone

Paper>Stone and Stone>Scissor but not Paper>Scissor

Stone>Scissor and Scissor>Paper but not Stone>Paper

In other cases some might be true and some not (e.g. sporting results):

  1. a beats b
  2. b beats c
  3. a beats c
  4. a beats d
  5. d beats e
  6. e beats a
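
The two examples above can be checked mechanically. The sketch below lists every chain a>b, b>c for which a>c is not recorded; the function name is an assumption introduced for illustration.

```python
def transitivity_violations(pairs):
    """Return every (a, b, c) where a>b and b>c are recorded but a>c is not."""
    pairs = set(pairs)
    return sorted(
        (a, b, c)
        for (a, b) in pairs
        for (b2, c) in pairs
        if b == b2 and (a, c) not in pairs
    )


# Scissor/Paper/Stone as written in the notes: no chain is transitive.
game = [("Scissor", "Paper"), ("Paper", "Stone"), ("Stone", "Scissor")]

# Sporting results 1-6 from the notes: some chains hold, some do not.
results = [("a", "b"), ("b", "c"), ("a", "c"), ("a", "d"), ("d", "e"), ("e", "a")]

for triple in transitivity_violations(game):
    print("game:", triple)
for triple in transitivity_violations(results):
    print("results:", triple)
```

For the sporting results the chain a, b, c is consistent with transitivity (a beats c is recorded), while a, d, e is not (a beats e is not recorded; e beats a instead).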


Modelers developing classes might set a fixed value on a dimension (e.g. Parent-child is not transitive) or leave the option up to end-users (a sequence might be required to be transitive or not).

Default is unspecified (value is required)


Dan will lay out useful combinations.



 Virtual Meeting 2015-07-08

2015-07-08 Modeling Meeting

Attending: Flavio Rizzolo, Arofan Gregory, Dan Gillman, Jay Greenfield, Larry Hoyle, Olof Olsson

Should we use “realizes” for classes like OrderRelation (using them as interfaces) rather than as inheritance?  We need to be able to annotate that this is an interface in Drupal.

The pattern is set out with a relationship to another class as a realization of an interface. With an interface the only use is through a realization. It can’t be extended by a class.

Is this a problem?


First we should describe a set of Boolean attributes to define the sub-types of order relations.

Then the question is whether to do this through properties vs inherited classes

Total-partial

Transitive-not

Reflexive-not

(Sequence, Parent-child, Part-whole, Relationship)

Not all combinations are possible.


Will setting these as properties be very difficult for users?


Issues

How we define the model

The rules we put forth

What does it look like to an implementer?


Making things explicit puts constraints on use (avoids people breaking things).

Some of this modeling will take place in the transition between Platform Independent to Platform Specific.

In RDF the names of the predicates imply (through definition) all of the properties.

Alternatives:

(1)    We have a single OrderRelation class in the pattern, and rely on its definition when realized to determine whether it is partial, transitive, reflexive, etc. The review process would guarantee that definitions are rigorous.

(2)    We expand OrderRelation in the pattern with a set of useful sub-classes, which specify transitivity, reflexivity, etc., and these are the classes in the pattern which are realized instead of OrderRelation.

(3)    A variation of 2: We have the OrderRelation class, as well as completely defined useful subclasses, but we allow any of them to be realization points in the pattern. People who don’t understand the subclasses would just realize OrderRelation directly, and the review process would correct any misuse.

(4)    We have a class OrderRelation with a set of attributes which specify whole/part, transitivity, reflexivity, etc.

(5)    Variation of 4: We have these attributes on OrderRelation, but we do not require that they be specified.
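
To make the difference between the attribute-based alternatives (4, 5) and the subclass-based alternative (2) concrete, here is a rough sketch. The class `StrictTotalOrder`, the enum `Tri`, and the field names are hypothetical and only illustrate the shape of each option.

```python
from dataclasses import dataclass
from enum import Enum


class Tri(Enum):
    YES = "yes"
    NO = "no"
    UNSPECIFIED = "unspecified"


@dataclass
class OrderRelation:
    """Alternatives 4 and 5: the dimensions are attributes on a single class.

    With UNSPECIFIED as the default, nobody is forced to state them (alternative 5).
    """
    semantic: str
    is_partial: Tri = Tri.UNSPECIFIED
    is_transitive: Tri = Tri.UNSPECIFIED
    is_reflexive: Tri = Tri.UNSPECIFIED


@dataclass
class StrictTotalOrder(OrderRelation):
    """Alternative 2: a subclass supplies the values so users do not set them."""
    is_partial: Tri = Tri.NO
    is_transitive: Tri = Tri.YES
    is_reflexive: Tri = Tri.NO


def unspecified_dimensions(rel):
    """List the dimensions a reviewer would still have to pin down."""
    names = ("is_partial", "is_transitive", "is_reflexive")
    return [n for n in names if getattr(rel, n) is Tri.UNSPECIFIED]


print(unspecified_dimensions(OrderRelation("Parent-child")))
print(unspecified_dimensions(StrictTotalOrder("Sequence")))
```

Note that in this sketch the subclass only changes defaults; a real binding would need an additional rule to stop users overriding the fixed values.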


Arofan suggested that 1 and 5 above may be the best choices.

Too much specificity will be a turn-off for some users; others will want the specificity.



Separate issue:

Is equality an order relation – no



 Virtual meeting 2015-07-01

ATTENDEES: Wendy, Olof, Larry

Sphinx

Reviewed Sphinx and discussed use in building the documentation for DDI 4 (comments)

https://ddi4.readthedocs.org/en/latest/About/documentationSyntax.html

  • Structures like table were intuitive and easier to accomplish than say HTML
  • Content would be easy to transfer to another format if needed in the future (content such as table, glossary, lists etc. identified as blocks of content with few embedded tags)
  • Content can be managed within the repository alongside the output of Drupal for automated processing to the final document
  • With a local install, the rebuild is automatic upon editing, creating a visual display of the current entry work (easier for less experienced users)

Olof will set up some shared screen time with Jon to go over the structure and tags.

Modeling Team should look at this as a possible means of creating Modeling Guidelines and keeping them available and current. This issue will be raised by TC as a result of review comments.

On-going question is what goes into Drupal and what goes into these documents in order to facilitate a smooth build process for all products, for example XML examples and RDF or other binding type examples.

BINDINGS:

The question of RDF production was raised. Olof suggested that one means of addressing this would be to offer a small grant to create the initial RDF mapping and syntax binding work. Wendy will bring this idea to the AG.



 Virtual meeting 2015-06-24

ATTENDEES: Wendy, Flavio, Jay, Oliver, Olof, Larry

Agenda: Methodology and Core Process

Core Process is still being reviewed particularly in terms of Input, Output and binding

Methodology in Lion:

  • There are parts of the model that are pretty mature and are understood by others (ex. collections). Would like to get process model to same stage.
  • Method is a set of principles, tools and practices which can be used to guide processes to achieve a particular goal. Method is abstract. Seems to line up well with GSIM. Inputs and Outputs are not to be confused with goals. There is no correspondence, and this is in line with BPMN/BPEL. Inputs and outputs are part of the message flow and are limited to data and programs.
  • One swim lane is for preconditions and methods and allows us to talk about a sequence or flow in a business process where one can be a precondition of another.
  • We are integrating the process model with process.
  • Relationship between a Process and Process Step (extends a Process)
  • You can represent a Process in a more or less granular way. You don't have to break it down into process steps but you can.
  • Q: Looks like all the inputs and outputs are at the step level and not viewable from the higher level?
  • We need to be able to describe at both levels so we have two swim lanes by separating the what from the how (what is process, how is process steps). The idea was to come up with an approach that worked the way software does, but we need
  • The Sequence order relation can be partial, so it allows parallel activity.
  • What we're trying to work into the model is the collection theme. Not entirely worked out.
  • We need to test this out.
  • Flavio and Jay need to discuss off-line to maybe simplify a bit more in terms of inheritance line
  • Take a methodology Use Case to take through both Methodology and Core Process. View through two different users.
  • Once this is revised Jay would like to write a document that does a bit of testing and then everyone goes after. Perhaps in the Methodology group.
  • What we didn't have a lot of agreement on was extension points like Sampling. How these additional objects are introduced.
  • This suggests a two step approach. Testing by modeling team and then test the more matured model by the methodology group and experimental design.
  • Need to retain close ties with the Methodology group to keep these aligned. Experimental design needs to be brought into this also. Wendy will contact Michelle regarding this work and the work of the Methodology group.
  • This will be a lot of work but will be very important.


How to go about cleaning up the model:

  • TC is responsible for reviewing and recommending solutions. Modelers implement.
  • Over the summer we want to:
    • resolve and close those things that are focused changes
    • complete work on specific issues
    • prepare a rules set for use of ComplexDataTypes, Primitives, and standard structures
  • Then do a general review of the model and how it holds together
  • Top to bottom review during Dagstuhl
  • Get the rules set up beforehand. The changes should be done by very few people.
    • The fewer people who make changes in the model the better
    • New stuff should be added by people doing the work
    • Changes to things that are there need to be handled by a small group to avoid cascade changes
  • Need a short training session at the sprint for use of tools - Lion and Atlassian
  • Short crib sheet for use during the sprint
  • Wendy will bring together and put on a wiki page


NEXT WEEK:

Sphinx structure - install instructions

People review and be prepared to discuss any questions next week

https://ddi4.readthedocs.org/en/latest/About/documentationSyntax.html

Live screen demo by Olof

IN TWO WEEKS:

Update on Collection and Process Model

VACATIONS:

Oliver July 1, 8, 15
Flavio  July 1
Wendy July 8, 29

 Virtual meeting 2015-06-17

ATTENDEES: Wendy, Achim, Larry, Jay, Oliver, Dan G., Flavio

Update on what's going on with the Qualitative model (separate groups) Implications for defining a segment in text and a physical object

Agenda:

Flavio's document on Views

  • 4 options - walked through the paper describing options and issues with each
    • Do we want #4 as this supports addition of content that is not specifically in the packages.
    • Option 2 seems practical in terms of managing through transformations
      • Minneapolis discussion - platform independent and platform specific model (specific to binding)
      • Restrictions could be defined in the mapping rules.
    • Option 3 sounds like what was discussed in Minneapolis
    • Drupal in a sense is doing 2 by identifying objects external to a package
    • The problem of Option 1 is that you end up with a view that is much larger than what you want in the model.
    • We want to have restrictions of some sort. Approaches 2 and 3
    • Subcase of 2 is only allowing restrictions of things outside of the view
    • What is the danger of approach 2 or 3 from an interoperability perspective?
      • Example: Class A has relationships to X, Y, Z in the package; in View 1 only Y and Z
      • Dan Smith also mentioned being able to harvest from multiple views and would need to deal with what was available in each view
      • In the Java example you are talking approach 4 because Java doesn't support restrictions. You end up creating a superclass. With 2 you end up with 2 superclasses and multiple inheritance

Mechanism for expressing the view and the XML bindings

  • A class is realized in the same manner in different views for the purpose of interoperability
  • Indicate when a property/relationship is NOT used with a related serialization without these passive properties/relationships
  • Should we support different restrictions of the same class in different views?
  • Provide view specific listing of what classes are included, which are restrictions, so developers can grab full class for modeling purposes and indicate which properties/relationships are NOT used by a view
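
A rough sketch of the "view specific listing" idea, using the Class A / X, Y, Z example from the discussion. The data layout (a library of full classes plus a per-view list of relationships NOT used) and the second view are assumptions made for illustration.

```python
# Full classes as they live in the library: class name -> relationships.
LIBRARY = {"A": {"X", "Y", "Z"}}

# Each view lists, per class, the relationships it does NOT use.
VIEW_1 = {"A": {"X"}}   # View 1 uses only Y and Z, as in the example
VIEW_2 = {"A": {"Z"}}   # hypothetical second view


def used_in_view(library, view):
    """Relationships of each class that remain active in the view."""
    return {name: library[name] - unused for name, unused in view.items()}


def shared_between(library, view_a, view_b):
    """Relationships a harvester can rely on in both views, per shared class."""
    a = used_in_view(library, view_a)
    b = used_in_view(library, view_b)
    return {name: a[name] & b[name] for name in a.keys() & b.keys()}


print(used_in_view(LIBRARY, VIEW_1))
print(shared_between(LIBRARY, VIEW_1, VIEW_2))
```

Because the full class stays in the library, a developer can always recover it; the interoperability risk shows up as the shrinking intersection when harvesting across views.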

Need to check these processes out in different bindings (XML, RDF, Java)

  • What example should we use and who will describe it? Wendy will identify some classes from Codebook for which they would want more limited views. Will send out.
  • Flavio will look at process model next week and see what's there and how it might relate to this issue.


Next week: Jay and Flavio's work on methodology and core process

 Virtual meeting 2015-06-10

ATTENDEES: Wendy, Flavio, Dan G., Oliver, Jay, Johan, Larry

Issues raised by Codebook:

  • Providing a consistent means of providing rationale
  • Recording number of significant digits (accuracy, precisions) - can change at various stages
  • Number of available storage positions is precision
  • Having 20 digits of precision is not necessarily meaningful - this is based on the statistical process

Low lying issues:

  • Bindings -
    • Currently about to implement decisions from Sprint (trying to get this into his work schedule, will have a better idea in a few weeks)
  • Next week: Jay and Flavio's work on methodology and core process - significant modifications are being made to the core process
    • Wendy will review Q1 comments for related issues
    • Data description modeling piece - finalize structure and relate to that package

Definition of a view - Flavio document

  • Distributed for discussion next week (will post when team pages are reorganized)

Layout of Modelling Team pages

  • Distributed (please make comments via email)

Standard information to have in views

  • Support for standard definition of metamodel in everything
  • What goes into the view (ability to restrict the contents of a class?) How is this implemented in bindings?
  • Are there standard classes that should be in views (i.e. coverage) if applicable to the business case?

NEXT Week's Agenda:

  • Flavio's document on Views
  • Jay and Flavio's work on methodology and core process

NOTE: Johan on vacation from June 14 - July 19 

 Notes from 2015-05-26 Sprint Discussion

Sprint 2015-05-26

Versioning discussion 2015-05-26


Release policy?

Frequency Intervals  of releases?

How to implement?

What is released?

What is in XMI? What is binding?

Same view different versions of classes?

Different Views have different versions of a given class?



Interest groups: what would their release policy preference be?

Motivated Developer 

Normal Developer 

Management  

Researchers?


Release numbers may be important for software providers to be able to describe compatibility


Rule: Any official release of a view must be consistent with all other views. A class in one view needs to have the same version as that class in the other views.
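
As a rough illustration of this rule, a release check could collect the version of each class across all views and flag any class bound at more than one version. The view names, class names, and version labels below are made up for the example.

```python
from collections import defaultdict


def inconsistent_classes(views):
    """views maps view name -> {class name: version}.

    Returns the classes that appear with more than one version across views.
    """
    seen = defaultdict(set)
    for listing in views.values():
        for cls, version in listing.items():
            seen[cls].add(version)
    return {cls: sorted(versions) for cls, versions in seen.items() if len(versions) > 1}


# Hypothetical release candidate with two views.
release = {
    "CodebookView": {"InstanceVariable": "2", "Population": "1"},
    "MethodologyView": {"InstanceVariable": "1", "Process": "1"},
}

print(inconsistent_classes(release))
```

An empty result would mean the release satisfies the rule; here the sketch reports InstanceVariable, which is bound at two different versions.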



Separate association to external objects from a view (Flavio)?



Release policy?

1)      Big bang 

2)      Small pieces (some views) with affected views (all dependencies when classes are not backward compatible)

3)      Only release changed views – allow inconsistency among views

4)      Big bang releases , minor releases, bug fixes, nightly builds



When a PIM interval changes, change a version number in the PSM

What is backward compatibility –  instances stay valid or only previously required properties still there?



Stable namespace?



Time intervals are a good idea, is it possible to implement?





 Virtual meeting 20 May 2015

ATTENDEES: Arofan, Wendy, Dan G., Johan, Larry, Jay, Jannik, Flavio

Sprint notes:

  • Everyone but Jannik will be at Sprint (we can set up conference if needed to include him one day)
  • Production has been scheduled for time daily during the sprint

Framework from last week; envision what release packages would look like

Versioning in the sense of a View:

  • We've decoupled the versioning of the objects in the model from the release
  • Create a snapshot of a point in time
  • In the bindings, when you have different objects with the same names, does this mean we require a version number?
  • How do we name new versions of each class?
  • What is bound are views so if they don't change are they reissued?
  • The problem is then the Uberview.
  • We probably have to wrestle with what we're going to call things when we version a class by adding a property or relation
  • What rules do we use? When does it get a new name? Without a name change things can get complicated at the level of the view.
  • You have to rename the class if either a property is added or relationship is added.
  • What do we mean by changing the name? In the past we have used namespaces to manage this
  • What is the scope of the namespace? One per View, one (or two) for the Library?
  • You could not change the name of the class but the transformations that produce the bindings could use rules to determine if the class has changed.
  • Separating the name from the version seems easier to describe the issue. If you use the date in the namespace then you know the version of the class that is used.
  • One policy is to say the things in the library are given a namespace corresponding with the date of release, same with the views.
  • Is it OK to have a class appear in more than one namespace?
  • If a new view has a new namespace then a class can have multiple namespaces.
  • A view points to a specific class (version) and so you need to know the namespace of the class used.
  • If you consider the identity of a class is its name and version and that is what you point to this is not complicated.
  • Since you never use 2 views simultaneously you wouldn't. Classes live in the Library NOT in a View (a list of references to named classes)
  • The discussion during the developers meeting was that all namespaces were removed and just go with DDI. Can we meet that requirement?
  • How do we reference classes? Just by the name or name/version (as separate).
  • Actually we want to reference by its identifier which is a combination of name + version.

PROPOSED: What we propose to have in Drupal is that we keep track of every time period for a class. When we release it's a 2.1 associated with a point in time. What we have in the release is identifiers (version qualified name). What we want to expose in the binding is the NAME. One proposal is that when you version a View the class is the version number of the View. This is a problem when you have multiple views with one or more versions all using the same class by reference but now with multiple namespaces.

Discussion:

  • The thing we need to remember is that what is in the repository is version proof. There is no penalty for adding things.
  • What matters most is that we need to see the View as binding
  • Can we provide the version and valid dates through the binding program?
  • It's ugly but you could put it in a separate set of objects - like a preface in each that lists the classes with their time spans

Walked through the process:

  • Get CodebookView A which has a listing of Classes and their version
  • Get another view (different version or different view) with similar list and then compare versions of classes from the list
  • We need to provide something to query and process a structure that is consistent with XMI but more compatible.
  • You could have an RDF representation of a particular snapshot.
  • I have a processable structure with their classes, validity dates, and diffs from different version
  • How much can be done in RDF, XML, and XMI?
  • RDF validation initiatives are taking place on how to validate RDF...how do you know when you have a complete picture. However you can provide a graph of what is considered a complete picture.
  • May need to do some research in RDF validation work.
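
The walk-through above (get one view's class listing, get another, compare versions) could look roughly like this. The listing format and the version labels are assumptions; the minutes do not fix a concrete structure.

```python
def compare_views(old, new):
    """Each listing maps class name -> version. Returns what differs between them."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(c for c in old.keys() & new.keys() if old[c] != new[c]),
    }


# Hypothetical listings for two snapshots of CodebookView.
codebook_a = {"InstanceVariable": "1", "Universe": "1", "Concept": "1"}
codebook_b = {"InstanceVariable": "2", "Population": "1", "Concept": "1"}

print(compare_views(codebook_a, codebook_b))
```

The same comparison would work whether the listings are read from an XML header, an RDF graph, or an XMI export, as long as each can be reduced to name and version pairs.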

Many types of users:

  • Modelers - won't have to know squat
  • XML / RDF readers - who understand the expressed binding
  • Developer who is just using the current version of the view - needs to know what versions are being used for each class
  • Developer who doesn't care - shouldn't have to be bothered with the life history of classes
  • Content person - needs to know when the boxes change

PROPOSAL

  • Snapshot has to be performed
  • List of contained classes - with version and/or time span
  • Limit to only one version of a class in a single view
  • Any given name can only have one structure (version) of a class in each view
  • Doesn't the time stamp picture take care of the problem of relationships over time?
  • Require consistencies within the class:
    • If a property is removed I can't use the later version
    • If a property is added I can still use a later version
  • Only one version of any given class can be used by a view
  • If you say a particular view has a version associated with it you have a particular versioned class
  • You can't remove required things
  • Only modify/version by appending
  • Create new extended objects to keep simple things simple
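
The "append only" consistency rule in this proposal can be sketched as a simple set comparison on property names. The function names and the example properties are illustrative assumptions.

```python
def can_use_later_version(old_props, new_props):
    """A later version can stand in for an earlier one only if nothing was removed."""
    return set(old_props) <= set(new_props)


def classify_change(old_props, new_props):
    """Label a change as unchanged, appended (compatible), or breaking (something removed)."""
    old, new = set(old_props), set(new_props)
    if old == new:
        return "unchanged"
    if old <= new:
        return "appended"
    return "breaking"


# Hypothetical property lists for two versions of a class.
v1 = ["name", "label"]
v2_appended = ["name", "label", "unitOfMeasure"]
v2_removed = ["name"]

print(classify_change(v1, v2_appended))
print(classify_change(v1, v2_removed))
```

A "breaking" result is the case where, under the proposal, a new extended or renamed class would be created instead of a new version.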

Need change rules:

  • Change of class
  • Extension of a class
  • The library has to be consistent
  • The View needs single version of a class
  • Need binding rules to make this work
  • In XML you'd say have an InstanceVariable but a listing in the header that lists Class Version Time/Span
  • Versioning at the level of the View denotes the point in time at which it was bound off the model.
  • We need the consistency rules. We need a way to validate and check prior to publication.
  • We probably need to work through the problem incrementally, but we have the benefit of similar scenarios to base the rules on.
  • You can't delete something that is still being used.
  • The patterns of things being used together is where the problem lies.

Experiment, look at the patterns and use these to generate rules. Example, the change from Universe to Population. If the relationship changes as of a certain time you will need to create a new class if you need both content.

  • Take Flavio's example of measure and population. How would this look in two views (early and later) and then another view that is combining classes which each use a different version.
  • When we work through a couple of examples it will become clearer where the complications arise.

ACTION: Whoever wants to should do it so we have several examples. Flavio, Wendy, plus. Include rules for defining change (version, extension, new class),

Additional comments:

  • Corner cases are fine but we need a more simplistic approach to deal with the majority of cases.
  • A view is what you implement so you have simplicity. Would not release a view that includes two different versions of a single class. Prefer that there were no versioning issues. Having the model version in every object would be hard because there would be multiple identifiers.


 Virtual meeting 13 May 2015

ATTENDEES: Arofan, Wendy, Dan G., Oliver, Jay, Johan, Larry, Flavio, some Mexican parrot

Announcements

  • GENEVA MEETING: Positive response to DDI 4, separation of the sentinel value in Variable
  • Jay: Refresh the NCI's CADFR (metadata repository); referenced variable cascade; referenced Flavio's approach to tracking the evolution of data elements; how big data can use this information (should be able to distribute RFI)
  • Changed "Ask the modelers a question" to "Ask the modeling team to address an issue"

Evolution of classes

  • Flavio's document and email discussion (see email trail at end of meeting notes)
  • Jay:  This is a graph problem where we could figure out a means of traversal to inform and define the versioning. Where we could avoid some rabbit holes.
  • Flavio:
  • Based on work done in 2008-09 plus a lot of related earlier work
  • Model temporal association between classes and versions of classes
  • Temporal interval used similar to ISO 11179, StatsCan, etc.; partially implemented in NIH repository
  • Walk through of paper -
    • Currently in Drupal this change results in loss of T0 InstanceVariable and replacement by T1 InstanceVariable
    • If you assume the ability to change/add an association without affecting the class
    • Is it possible to represent all of these classes with changes at each point in time
    • Using time spans to track change
    • Building up the ordered pair reflects the time interval when something is valid
    • Can add evolution associations to describe the change
    • You can reconstruct any version you want by specifying the time
    • Maintains a consistency so that if you want a class at a particular time you will have those
    • Every time you change the model you need to be sure the consistency is maintained in the relationship
    • People may be looking for a particular functionality rather than a time interval
    • It does seem that when someone wants to figure out which version of a view to use we need to be able to unequivocally provide direction, so the tooling is important
    • Implemented the intervals but not the evolution. It was actually lightweight because the intervals were 6x9. Issue is not space but searching. The right indexes make this work.
    • In terms of volume of information, would it be datetime or just date? The date could be associated with the release date as opposed to the change. So we could capture the change date but consolidate the release with the datetime of the release. What does a release version consist of? Should it be a snapshot or an interval? You need to include in a new release a canonical model which covers. The problem is a semantic one as it takes place at a point in time. The next release doesn't invalidate the former release, so is there a point in keeping an interval in the release date? That system existed in time interval X. During a particular time interval there are these properties. What is the interval (start and end)? Which is the start and which is the end?
    • This is a description of the model as a whole and how it changes over time. The release and the time interval are two separate ideas. The release is valid from its release date until it is deprecated regardless of the number of releases that follow it. The intervals are just for the conceptual model.
    • This seems like a good framework for looking at the concept of versioning. On the last call it was pointed out that we still need to determine what is a version change and what creates a new object. The issue of an extension can be usefully defined as an issue of conformity. Conformance is satisfying all requirements. Strict conformance is satisfying the requirements and nothing else.
    • Figure 5 shows the model with the evolution associations "becomes". Otherwise there is no way to understand that Population came from Universe. Substitutions will be valid in a view based on the time stamp of what were available at that point in time.
    • Figure 6 shows the model where the cascading effect doesn't take place. You have to check the immediate "neighborhood" for consistency but not further. Need to be able to prevent use of deprecated (an effective "deletion" for future use).
  • This needs to be part of the consistency check for release.
  • How do you see the views? If you want to maintain the view over time, they are references to the class.
  • Could manage the view as an implementation release.
  • Classify specific changes
  • What do our releases look like in their entirety?
  • How to represent this in the XML? For any given release you have a configuration at a point in time. Just paint the picture that is the view. It's a snapshot.
  • But what you need is an RDF representation of the temporal model to allow people to understand what we're doing over time. This means the information is available but separate. Uberview is current objects.
  • One thing we have to be careful about is that it is going to be difficult to find an old version of an object that has associations because.
  • You need to get away from this Uberview.
  • Extension can be a specialization.
  • Extension could be defined as narrowing or broadening
  • How do we deal with Uberview as an implementation?
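
A minimal sketch of the interval idea discussed above: each class carries a validity interval, a snapshot is reconstructed by specifying a time, and a "becomes" association records that Population came from Universe. The dates, the half-open interval convention, and all names other than Universe, Population, and InstanceVariable are assumptions for illustration; splits and merges are not handled.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class ClassVersion:
    name: str
    valid_from: date
    valid_to: Optional[date] = None  # None: still valid

    def valid_at(self, when):
        # Half-open interval [valid_from, valid_to): an assumption of this sketch.
        return self.valid_from <= when and (self.valid_to is None or when < self.valid_to)


def snapshot(versions, when):
    """Names of the classes that are valid at the given point in time."""
    return sorted(v.name for v in versions if v.valid_at(when))


def came_from(becomes, name):
    """Follow 'becomes' links backwards, assuming at most one predecessor per class."""
    lineage, current = [], name
    while True:
        earlier = [a for (a, b) in becomes if b == current]
        if not earlier or earlier[0] in lineage:
            return lineage
        current = earlier[0]
        lineage.append(current)


# Hypothetical timeline: Universe is replaced by Population on 2015-05-01.
model = [
    ClassVersion("InstanceVariable", date(2015, 1, 1)),
    ClassVersion("Universe", date(2015, 1, 1), date(2015, 5, 1)),
    ClassVersion("Population", date(2015, 5, 1)),
]
becomes = [("Universe", "Population")]

print(snapshot(model, date(2015, 3, 1)))
print(snapshot(model, date(2015, 6, 1)))
print(came_from(becomes, "Population"))
```

Keeping the "becomes" pairs separate from the class records mirrors the point made in the discussion that the evolution information can be available but stored apart from the released snapshot.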

ACTION: Requested that Johan take this discussion back to Olof for him to consider implications for Drupal.


Agenda for next week:

  • Go through the exercise of tracking exactly what the release process and content would look like


EMAIL DIALOG REGARDING PAPER
Larry:
A question: Would the "becomes" (meta?) relationship attach to the earlier incarnation of a class? Does this change it? In the canonical model (figure 6) Universe continues to exist, but also becomes something at t2.  Would it be better to model from the later class - e.g. Population "comes from" Universe?

I’ve been thinking about the implications of our versioning approach for developers trying to handle importing DDI-MD as it evolves. I think that we will want to have much stronger recommendations and "How to" documentation for those producing export instances of DDI intended for import by others.


Flavio
The "becomes" is indeed a relationship in the temporal metamodel, so it shouldn't really change any class. You could also view it as a mapping between two or more classes. In fact, you could almost represent it that way with a specialized temporal "entity" that keeps track of other entities that change in the model, much in the same way that OrderMemberCorrespondence works for Collections.

 In fact, I believe that if we assume that changes to relationships don't change the owner classes in any circumstance, then  the model is simplified considerably given that you avoid the cascade effect described in the document. That is what I called the canonical temporal (or evolution) model.

 The "becomes" is in fact the same as a "comes from" just pointing in the opposite direction, so we could use either. For richer temporal semantics, the paper I referenced has the notions of "split", "merge", "detach" and "join", defined from the "becomes", a partitive relationship and the temporal intervals of the participating classes, and they can be reversed as well.
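
As an illustration of the exchange above (not part of the original emails), here is a minimal sketch of "becomes" kept as a separate temporal mapping, so that neither class is changed by it. The class names `ModelClass` and `Becomes`, the fields, and the integer time steps are assumptions made for this example only; they are not part of the DDI model.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelClass:
    name: str
    valid_from: int                 # first time step at which the class is available
    valid_to: Optional[int] = None  # None means the class is still available


@dataclass(frozen=True)
class Becomes:
    """Evolution relationship kept outside the classes it relates (hypothetical)."""
    source: str
    target: str
    at: int  # time step at which the evolution happens


def comes_from(relations: List[Becomes], name: str) -> List[str]:
    """Read "becomes" in the opposite direction: where did `name` come from?"""
    return [r.source for r in relations if r.target == name]


def snapshot(classes: List[ModelClass], t: int) -> List[str]:
    """Names of the classes available at time step t, sorted alphabetically."""
    return sorted(
        c.name for c in classes
        if c.valid_from <= t and (c.valid_to is None or t <= c.valid_to)
    )


# Universe continues to exist, and also becomes Population at t2.
classes = [ModelClass("Universe", valid_from=1), ModelClass("Population", valid_from=2)]
relations = [Becomes("Universe", "Population", at=2)]

print(snapshot(classes, 1))
print(snapshot(classes, 2))
print(comes_from(relations, "Population"))
```

Because the relationship lives in its own record, adding it does not touch Universe or Population, and reading it as "comes from" is only a matter of filtering on the target instead of the source.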

 Virtual meeting 6 May 2015 NO MEETING

Cancelled due to meeting conflicts - Jay and Flavio will use the time to work up the example we discussed during this week’s meeting.

 Virtual meeting 29 April 2015

Attendees: Wendy Thomas, Flavio Rizzo, Jay Greenfield, Oliver Hopt, Arofan Gregory, Dan Gillman

First Issue: Versioning and effect on bindings

- Discussion at NADDI led to the comment that the current discussion of Versioning (as summarized in Achim's paper of a few weeks ago) could produce some difficulties with the binding in XML and RDF, such as having to put version numbers into the element names. A more elegant binding for this needs to be determined. Discussion of this led to a broader discussion about how versioning would work in reality.

- Consensus seemed to be that this is a "graph problem" and thus should be subject to a similar "graph solution" - this would consist of an object having a relationship to the object of which it is a revision.

- Jay and Flavio volunteered to come up with an example, so that we could try to apply different rules in a more realistic way, and to see what impact this would have on the bindings.
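
One way to picture the "graph solution" is sketched below: each object carries a reference to the object it revises, so the history can be walked without putting version numbers into element names. This is only an illustration; the `Node` class, the `revision_of` field, and the identifiers are hypothetical and do not come from the model or the bindings.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Node:
    id: str                            # identifier of this object (hypothetical)
    name: str                          # element name, unchanged across revisions
    revision_of: Optional[str] = None  # id of the object this one revises


def history(graph: Dict[str, Node], node_id: str) -> List[str]:
    """Walk the revision links from newest to oldest, guarding against cycles."""
    chain: List[str] = []
    seen = set()
    current: Optional[str] = node_id
    while current is not None and current not in seen:
        seen.add(current)
        chain.append(current)
        current = graph[current].revision_of
    return chain


nodes = [
    Node("obj-a", "Variable"),
    Node("obj-b", "Variable", revision_of="obj-a"),
    Node("obj-c", "Variable", revision_of="obj-b"),
]
graph = {n.id: n for n in nodes}

print(history(graph, "obj-c"))
```

All three objects keep the element name "Variable"; only the relationship records which one revises which.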

Second Issue: Release of the bindings for Q1

- When will we be ready to release the bindings for the Q1 material? For the XML, we think this will be soon. Wendy will determine the date on which Jon produced the documentation for what has already been released, and Oliver will apply the latest XML bindings to the XMI from that date. He will circulate to the group. Once reviewed, the schemas will go to Achim so he can produce the clickable HTML field level documentation.

- There is still a question about the RDF binding and how it would be documented. Arofan will circulate to the group the e-mail discussion with Franc Cotton, which might suggest some useful ways to document the RDF for review. We need also to check with Achim that the OWL is ready for review.

- There was some discussion about the changes which may result from the versioning discussion to the current bindings, and whether these were significant enough to hold up release of the bindings. General feeling was that we should not let the versioning discussion hold up the release.

- There was a strong comment that we needed to explain to reviewers how different the design of the XML will be. This could be incorporated into the User's Guide, which Jon is still working on, but which has not yet been released.

- The timing of the bindings release vis-a-vis other releases for review will be raised by Wendy in both the TC and AG meetings.

- It was decided that the binding release was an additive part of the Q1 release- "Q1b", and should be treated that way as much as possible.

Third Issue: When do we un-freeze Drupal?

- We decided to keep Drupal frozen during the Q1 review period, so that it stays in sync with the material now being reviewed.

 Virtual meeting 15 April 2015

Attendees: Wendy, Olof, Larry, Johan, Jay, Jannik, Oliver, Flavio, Arofan

NOTE: Olof has asked to attend these meetings to assist in the maintenance of Drupal. We're glad he can join us!

Versioning:

  • Discussion with Olof about options within Drupal. He will think about the possibilities.

Schemas:

  • A lot of issues are fixed
  • Substitution groups are still an issue
  • Complex data types as attributes
  • CodeValueTypeType etc.

Customized Metadata - unforeseen metadata

  • Creating a metamodel
  • Showed work from Madison about a metamodel
  • Custom metadata structure can also be used to express DDI controlled vocabularies
  • Report is a use of the instance of the structure to report custom metadata
  • Only allowed in specific areas
  • Datum discussion about context - this would allow context specific information that we don't know in advance
  • Jay would like to look at this in context of his paper describing design patterns in UML profile - it may provide some power to this model
  • Arofan will try to find time to write up (update of 10 page paper on customized metadata and model), Jay should use that for comparison
  • Joint meeting with Data Description Team
  • Larry showed flip chart drawings from meeting (example from Citation for OMB Degree of Burden)
  • Questions about EA: GSIM Referential Metadata - same thing but to include the structural information in addition to the
  • Why do we need two levels? We have a report of a blood pressure reading - we have a structure we can report but don't have a means of validating unless we can describe the structure.
  • Could include other standards such as Open EHR by describing it in a Common Metadata Structure
    • Example of blood pressure: Datum (systolic, diastolic) State, Event, Protocol
  • Allows you to publish the JSON structure
  • Could allow fuller tie-ins to Concepts etc.
  • Concerns:
    • The connections to the semantics
    • Problem of too wide usage
    • The intention is to only use this in specific places where we know that we cannot know in advance. Allowed only with titled semantics. We need to be very clear about where this is available.
    • Attribute structure in GSIM could subsume all of this under an attribute component
    • Your items in SDMX are the attributes - contentious issue in GSIM
    • We need to be sure it is only used in specific context rather than things that are already there in the DDI model
  • The models would be in the utility package in their abstract form - used only in specific places - so we control the use
  • We may still have two ways of doing this- still have key value pairs but allow the possibility that people would describe their CVs
  • This goes beyond the flat or hierarchical approach to graph-like structures
  • Jay would like to present a way that this is actually being implemented with SPLUNK - begins with no understanding of data (they are streams), then it comes in and constructs key-value pairs, creates context, then structure
  • TO DO: Schedule a joint call with the data description team when we have some sort of consensus - including revised paper, Open EHR, SPLUNK, other?
  • We need a pattern for custom metadata that is consistent
  • Need to feed back our work to GSIM

Review - what's needed from the Modelers perspective for the "read me" document:

  • Hold an on-line meeting with folks who want to do the reviews by walking them through the model - a DDI Town Hall
  • Do we need a UML Tutorial - post it to SRG list for consideration by AG
  • Click through document
  • We may have not done enough cleaning to simplify - are things still too complex? too simple?
  • Are we covering the 60%, 80%, 90%?
  • Reaction to the cascade model of describing variables
  • Under-specification or underdocumentation
  • Inheritance through extension strings needs to be explained!!
  • A table view inherited attributes - so people can see everything that is available
  • Have some examples of views, an Agent view etc.
  • Point out the problem of the "wrapper" - mechanisms for including information about the package - we could have input at the binding level but it would vary by binding
    • A standard set of things that go into every view and then put a container around them like "View Management"


 Virtual meeting 8 April 2015 NO MEETING

Cancelled due to NADDI

 Virtual meeting 1 April 2015

Attendees: Wendy, Dan G. Jon, Larry, Arofan

  • Documentation  (Jon)
    • Critical is the "Why a new version"
    • Need a statement about release approach - separate section "Why isn't it all there?" Larry provided paragraph from other document and will paste it in at the end.
    • The Google Doc is not a "document" but is text to use consistently throughout the DDI communications system. Add an opening paragraph
    • Model based / Model Driven MD needs to be cleaned based on decision.
    • This is a short snappy document that the AG can use to hit the main points and use consistently throughout the system. Send to AG as soon as Jon completes.
    • Arofan's document Work Product (messaging) is good as it focuses on the product line
  • Cardinality check - Flavio
  • Should representations go out as a full package - somebody needs to write up a comment on this package
  • Could a clarification of which classes are being reviewed in the partial packages be added to the "by hand" documentation
  • If you are mucking about in any of the packages identified for review we need to take the snap shot so we know what we have
  • Larry - Because we have separate working groups there are things out there that are not really connected; at some point the modeling group needs to look at how all these things connect. A lot of this will pop up in Q2. It would be good to have the UberView image at this point.
  • Modelers meeting cancelled next week.
 Virtual meeting 25 March 2015

Attendees: Wendy, Flavio, Larry, Jon, Oliver, Jannik, Jay, Dan

UberView

Notes from TC on UberView

  • Should an UberView be published?
    • Any reasonable definition of View would say the entire library is one but is it one we publish? The discussion took place 2 Dagstuhls ago but hasn't arisen lately.
    • It makes sense that there is a requirement for bindings of the entire published library
    • Were there any issues identified for not doing it? Not really...we were just focusing on other things. There's additional stuff we'd need to do to make this happen but it is doable.
    • Very useful as a means of connecting different views and understanding the holistic model
    • Lets people mix and match without creating new views
    • In the current lifecycle we have internal and external references...what are the implications in defining views.
    • ACTION: Send strong recommendation to Modelers that we produce an UberView of the Library Packages, either in Q1 or later if it would cause a delay in terms of generating it out of Drupal XMI
    • ACTION: Send an advisory to AG, Scientific Board, and Executive Board that a technical decision has been made to publish a full Library View "UberView"

Discussion

  • Should be easy to create a View with automatic harvesting 
  • Oliver can do an XML but we don't know about the RDF ("that could be interesting")
  • Best to have it done in Drupal which probably means a Q2
  • Be clear who the audience is for this - we may want to hold on this until we are past the periodic development draft reviews

AG framing document

  • Issues raised by Achim can also be addressed by documentation
  • Not sure how we are addressing this in a coordinated manner
  • How are we going to be using the model - we lack the story of how to review
    • Where to start with the review - model, XML, RDF????
  • How do we set up so you can start at various places
    • Top down documentation from parent document to children 
    • Links to allow traversing from any starting point
    • All starting points bring you back to a parent document - we need this kind of document
    • We don't have any single document like that -
      • Training group is going in that direction
      • Why you want to do
      • What use cases would suit the best
    • What is the Moving Forward Project? What is important to know for the development draft reviews
    • Why we are addressing what we are
    •  Documents need to be separate
  • How to get this through a Google document we can jointly edit
    • Make an outline first 
  • For outline 
    • Goals of DDI4
    • Why we're doing it
    • What things in 2 and 3 have needed something
    • DDI Maintenance
    • Use Case
    • Process history 
    • Evolution
    • Talk to Steve regarding integration work (puts together different components)
    • Motivation to move beyond survey to other domains
    • Where to start (model, RDF, XML etc.) - can we profile group or use case
      • In the course of text there are links to other documents
      • At end of each document there is a set of links so you can see related documents in a group (apart from the text)
      • A road map
      • We need these documents available in one place - we need to have relationships and information on target arguments (Jon has a list)
      • How much of this is in the Technical Guide?
      • Right now this is all in a word document and we need a better way to do the cross references
      • Gaps in the Technical document is having the "starting place"
      • We've had a User Guide and Technical Guide
      • Needs to be chunkable so we can reuse portions and allow people to easily move around
  • We need like a 2 page narrative covering what and why we are doing DDI 4 and identify the pain points
    • Jon will take a bash at this for review on

Olof added some objects from Drupal that we wanted in the XMI for documentation purposes

What is going into Q1:

  •  Full Packages
    • Agents
    • Complex Data Types
    • Conceptual
    • Core Process
    • Identification
    • Primitives
    • Utility
    • Collections
    • Representations
    • Discovery
  • Partial Packages
    • Data Capture (2 of 10 - Capture, ResponseDomain)
    • Processing (4 of 6 - Command, CommandFile, Parameter, StructuredCommand)
    • Question regarding NewObjectsForReview (Wendy will look at) REVIEWED - DO NOT include in Q1
    • Question regarding review of Representations - Larry made a list - (Flavio will look at and get back to group) (minimal objects to review: CategorySet, DataType, Designation, Note, NodeSet, ValueDomain, Vocabulary)
  • Views
    • Agent
    • Discovery

Flavio's Notes on Relationships

  • Q1 we need to explain the decision but not necessarily implement
  • Flavio will prepare documentation on the decision
    • Whatever is in the model doesn't express direction - all are undirected unless there is an indicator
    • Information on directional will be added to Drupal (post Q1) 

Drupal will be set after the freeze to just include those to be published (published switch)

 Virtual meeting 18 March 2015

Attendees: Arofan, Wendy, Flavio, Dan G, Oliver, Larry, Jay

Issue of bi-directional and non-directional associations expressed in model

  • Variable point to value domain
  • Dan feels there should be a bidirectional model
  • If the value domain points to the Variable then I have to version the value domain because you are adding a new relationship
  • The bi-directional is captured in implementation (indexing)
  • The problem is that conceptually there shouldn't be pointers but associations that are not represented in either option but there is no way to do this in the drupal because it doesn't have undirected associations. As long as the associations are made in one direction and then the bi-directional can be solved in the binding or implementation (RDF bi-directional, XML indexes)
  • You do express directionality in the semantics of the relationship and in identifying the source and target
  • As long as the association is a first-class citizen it is
  • Bindings are all able to query backwards on a uni-directional association so the value of expressing the bi-directionality in the model is not necessary
  • Variable to value domain directionality - Wendy, Oliver,
  • Queries are handled by tools
  • Could put an optional semantic for the reverse - what do we gain with this?
  • If people are looking at the model and see directional associations they will assume they can't go the other way
  • Is it a drupal bug? should it be drawing the return path?
  • Typically people don't put in both directions but
  • In GSIM model very few are directed (they are not on the on-line version but are modeled as directed) -- actually are directional
  • In the model we don't say that relationships are directed but if we need them to be directed in the implementation that is done in the binding. Associations are normally bi-directional and you are telling people there is an interpretation in both directions that might be useful. From the point of view of the model and provisions we are trying to convey we don't want to relay uni-directionally in the view of the model.
  • We can document the implementation: where do we capture directionality
  • Represented variable has a relationship takesValueFrom
  • If we could indicate bi-directionality and provide both names and indicate preferred direction
  • Can this be expressed in XMI?
  • Name to association plus a semantic associated with the direction of the relationship
  • If we are specifying a relationship only once between 2 classes we have to pick a source and a target in order to document it. If there is reason to go in the other direction, you can think of it in both even though it's documented in only one (the model should express this).
  • We are misusing the names of the associations if they are being used to imply directionality
  • If you think about just being inside a value domain and looking just what you have there it means you reduce your navigation possibilities. However you are always looking from the top and therefore can look for anything that contains the value domain identification in its class
  • This is an issue that was raised two years ago and would require a major revision of our modeling style - this was made by another group (pre-modelers)
  • We need to table this issue and determine the direction of this particular type of association as a consistent means of representing this.
  • Think that arrow heads may be unexpressed in Drupal due to space(?)
  • If there are technical reasons for them being expressed this way we need to make it abundantly clear to the reader:
  • Unless specifically stated the associations should be interpreted as bi-directional
  • Source and Target does not imply anything regarding the directionality other than the verb for the naming convention
  • Source/Target drawing seems to be consistent
  • Name is an issue because it is a semantic label we put on the relationship - convention is source describes the relationship
  • ACTION: Flavio will write up documentation on relationship
  • Jay - can we create a programmatic solution in the future so that, for example, "represents" comes back as represents/isRepresentedBy?
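
The point that a binding or implementation can answer the reverse question through indexing, even when the association is recorded in one direction only, can be sketched as follows. This is a rough illustration, not how Drupal or the bindings work; `Association`, `AssociationStore`, and the reverse label `providesValuesFor` are hypothetical names chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import DefaultDict, List, Tuple


@dataclass(frozen=True)
class Association:
    source: str
    name: str               # label read from source to target
    target: str
    reverse_name: str = ""  # optional label for the opposite reading


class AssociationStore:
    """Records each association once and indexes it from both ends."""

    def __init__(self) -> None:
        self._by_source: DefaultDict[str, List[Association]] = defaultdict(list)
        self._by_target: DefaultDict[str, List[Association]] = defaultdict(list)

    def add(self, assoc: Association) -> None:
        self._by_source[assoc.source].append(assoc)
        self._by_target[assoc.target].append(assoc)

    def outgoing(self, obj: str) -> List[Tuple[str, str]]:
        return [(a.name, a.target) for a in self._by_source.get(obj, [])]

    def incoming(self, obj: str) -> List[Tuple[str, str]]:
        # Fall back to a generated label when no reverse name was supplied.
        return [
            (a.reverse_name or f"inverseOf({a.name})", a.source)
            for a in self._by_target.get(obj, [])
        ]


store = AssociationStore()
store.add(Association("RepresentedVariable", "takesValueFrom", "ValueDomain",
                      reverse_name="providesValuesFor"))

print(store.outgoing("RepresentedVariable"))
print(store.incoming("ValueDomain"))
```

The association is entered once with a source and a target, so adding a new variable does not require a change to the value domain itself, yet the value domain can still be asked which variables use it.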

Oliver's issue

  • Is the solution in his email definitely needed? Seeing Collection as a pattern rather than a super-class allows easier binding, but the pattern is violated frequently. It's possible to be more rigorous by using a design pattern with a UML profile to enforce this computationally.
  • Sent an example article on the use of UML profile to do just that (Jay) "The Design Pattern Instantiation Directed by Concretization and Specialization.pdf"
  • Collection and Process Model are currently the problem. Could these be placed in a strict package of superclasses?
  • Let's currently use the restriction as described in email - pragmatic way forward

Continue to meet at the designated CET time during DST

 Virtual meeting 4 March 2015

Attendees: Flavio, Wendy, Dan G., Larry, Oliver, Johan    

Agenda: Complete Flavio's question list, follow up on Larry's DDI Report on cardinality

Flavio's Question 9. Why does it have a StandardKeyValuePair? Because it is available in all standard statistical packages on the variable and on the store  

  • The StandardKeyValuePair is available on all VersionableTypes as UserAttributePair. Understand the need to use a generic in statistical packages. Shouldn't we deal with known objects? A system specific user defined property of the object expressed as a key/value pair. As this is specific to an individual system the use of controlled vocabularies for the key is strongly recommended.
  • Require the use of a controlled vocabulary...how?
  • Why is it required in SAS or SPSS etc.? All allow for addition of key/value pairs to stored variables.
  • If you are going to use the structure in SAS you should map back to the appropriate DDI object (such as an attribution)
  • The question of whether we should support this "alternative" means of making an extension was raised when we added this to 3.2 (requested by Dan/Jeremy)
  • When you use it for your own system it's OK but it should not be expected to be interoperable
  • It should not be considered to be interoperable but SYSTEM SPECIFIC
  • We need to clarify how we are doing extensions.
  • But was intended for a run-time rather than as a formal interoperable extensions
  • Need to be able
  • there is a danger in making this available and having it abused by translators from SAS / STATA etc.
  • Documentation from Note/proprietaryInfo: A set of actions related to the object as described by a set of name-value pairs. This would commonly be used in a case where additional information needs to be recorded regarding the content of a new element or attribute that has not yet been added to the schema, for example when a bug for a missing object has been filed and the user wishes to record the content prior to correction in the schema. Ideally this should be handled by local extensions of the schema as described in Part 2 of the formal documentation. However, the structure in Note allows for an unanticipated need for an extension at run time by providing a means of capturing system specific information in a structured way.
  • At the same time you want to be able describe a data set explicitly, i.e. is it important to be able to document that certain key value pairs were present in an original dataset even if the keys could (should) be mapped to some known structure/vocabulary?
  • Example: Creator key/value SAS map to Annotation/Creator The danger in using a KeyValuePair is that you now have information on Creator in two locations (Creator and StandardKeyValuePair). Could this be done by reference, e.g. documenting that a key value pair existed in the original dataset could be done by reference to some known object (like Creator)?
  • You can only map from SAS to STATA via DDI if you move it into a specific home
  • Mappings are external. When you move data into DDI you should retain mapping.
  • How different is this case from any other where we want to map in and out.
  • Advocating dropping it and adding extension points (Flavio, Dan G., Wendy)
  • As a programmer a KeyValue pair that is system specific gives a place for system specific information. Can we use Note which has a Key/Value Pair.
  • We should put more effort into modeling it in a specific way so that we don't leave it so susceptible to abuse. It should be handled by formal extensions
  • If everyone is doing extensions these need to be followed and introduced into the model.
  • We need to have a means of governance for managing extensions, determining what is included and in what format.
  • Added overhead of doing something that is very prescribed is an issue but if everyone uses key value pairs then you lose interoperability. Use of common attributes described outside of the standard would allow understanding of the meaning and content. The problem with the open key value pair is that you can't predict how it is being used, can't validate. It is more of a one-time thing. Once defined you can use it forever. Common extensions should be brought into the standard and convert to something stable.
  • We need a place to share. Create a common GitHub or some form of repository. We want to be sure we are interoperable with our systems, i.e. we are using BitBucket not GitHub. BitBucket may not be as visible.
  • AGREE: Drop StandardKeyValuePair but don't have a replacement. Some people go off to recommend a means of extension. Raise it as an issue for review.
  • Come back to this next meeting or note it as an issue
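
The idea of moving system-specific key/value pairs into a known home while retaining the mapping can be sketched as below. Only the Creator to Annotation/Creator pairing comes from the discussion; the function name `place_pairs`, the record layout, and the `LocalFlag` key are hypothetical, and nothing here represents an agreed extension mechanism.

```python
from typing import Dict, List, Tuple

# External mapping from system-specific keys to known objects.
# Only the Creator entry is taken from the discussion above.
KNOWN_HOMES: Dict[str, str] = {"Creator": "Annotation/Creator"}


def place_pairs(
    pairs: Dict[str, str], known_homes: Dict[str, str]
) -> Tuple[List[Dict[str, str]], List[Tuple[str, str]]]:
    """Split key/value pairs into mapped records and leftovers needing review."""
    mapped: List[Dict[str, str]] = []
    unmapped: List[Tuple[str, str]] = []
    for key, value in pairs.items():
        home = known_homes.get(key)
        if home is None:
            # No known home: flag it instead of storing an open key/value pair.
            unmapped.append((key, value))
        else:
            # Keep the original key so the mapping back to the source is retained.
            mapped.append({"target": home, "value": value, "source_key": key})
    return mapped, unmapped


mapped, unmapped = place_pairs({"Creator": "J. Smith", "LocalFlag": "7"}, KNOWN_HOMES)
print(mapped)
print(unmapped)
```

Storing the value once under its target, with the source key kept as a reference, avoids having Creator information in two locations; the leftover list is where a governance process for extensions would start.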

Cardinality Review (Larry's DDIReport)

  • Larry has run this for every object in the model, as our relationship names don't work very well in a sentence (e.g. metadataQuality and type MetadataQuality). Object of the exercise was to list these as English-type sentences for easier review. Review for clarity of structure in alphabetical order. Second version is HTML so they are linked (nice).
  • Review for next week.
 Virtual meeting 25 Feb 2015

Attendees:  Arofan, Wendy, Jannik, Johan, Dan G. Jay, Flavio, Oliver, Achim

1) Representations:

  • Didn't find a lot of issues
  • Confusion - the way the numbers get rendered on the arcs in the model (numbers could be flipped)
  • When you express the meaning of the relationship the verb is on the source object and cardinality on the target  
  • It appears that the cardinalities are coming up on the reverse side  
  • Black diamond between StatisticalClassification and ClassificationSeries that may not make sense (says a StatisticalClassification can exist without a series) - should stay consistent with Neuchatel (GSIM is white diamond) ACTION change to aggregation (DONE)
  • ACTION ITEM: Review of Source and Target cardinality; write a standard sentence. Create a change log. If one writes sentence and the other reviews. Larry will go through and write sentences.

Description should be programmatically generated by the cardinality at some point.
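
A rough sketch of what such generation could look like is given below, following the convention noted above that the verb belongs to the source object and the cardinality to the target. The function names and the sentence template are assumptions made for illustration, and the 0..n cardinality in the example is hypothetical; this is not the logic of Larry's DDIReport.

```python
def describe_bound(cardinality: str) -> str:
    """Turn a 'lower..upper' cardinality such as '0..n' into an English phrase."""
    lower, upper = cardinality.split("..")
    if upper == "n":
        return "zero or more" if lower == "0" else f"at least {lower}"
    if lower == upper:
        return f"exactly {lower}"
    return f"between {lower} and {upper}"


def sentence(source: str, verb: str, target: str, target_cardinality: str) -> str:
    """Verb is read from the source; the cardinality applies to the target."""
    return f"Each {source} {verb} {describe_bound(target_cardinality)} {target}."


print(sentence("RepresentedVariable", "takesValueFrom", "ValueDomain", "0..n"))
print(sentence("ConceptParentChild", "has as parent", "Concept", "1..1"))
```

Writing the sentences out this way also exposes relationship names that do not read well in a sentence, which is one of the things the review is meant to catch.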

Some associations are not displayed at all  

  • Example: Conceptual concept parent child and concept part whole are not pointing anywhere
  • ACTION: Flavio write up and make a list of these occurrences and send to Jon, Olof, and Johanna. Results in an image in build with 0 bytes:
  • Error: :50: syntax error near line 50 context: DDI_ConceptParentChild:ConceptParentChildparent -> DDI_Concept [side= >>> arrowhead= <<< none labeldistance=1.9 taillabel="0..n" headlabel="1..1" edgetooltip="parent" fontcolor="black" color="#00000"];   

Are we rendering a 0..n relationship correctly? Is 1..n consistent with an open diamond  

  • We need consistency in how we enter cardinality  
  • You have composition and aggregation. Composition has a life cycle dependency. When you have a composition do you allow there to be 0 children? In some situations you will have a collection where the children depend upon the parent (Node / NodeSet). Can you design a NodeSet with 0 nodes? Questionnaire and instrument: can a questionnaire exist without an instrument? A data set with no rows is a good example. Neither aggregation nor composition requires content. The part has to have the container but the container doesn't need parts. Other people look at the style of assigning cardinalities as a means of providing consistency. If we look at what people have done you will find inconsistencies in the type of relationship and the cardinality, especially where we are talking about a necessary relationship. ACTION: Jay will review, provide examples and talk with Larry about creating the appropriate language and the direction of relationships. Bring this back to group. We need to determine if this is a real issue or can just be decided on a case-by-case basis. This could feed into the way Drupal renders drawings.

2) Flavio's question list 7-9

  • Q7 - Value Domains point to the variables and not the other way round.
    • TABLE discussion: Bi-directional relationship - even within RDF you have directionality - you can name it independent of the direction We need to be consistent - if unidirectional we need to determine direction
    • We have a two-way relationship militating against the direction nature of RDF and XML implementation.
  • Q8 - DataType and InstanceVariable  See comments in conceptual plus DataType has no definition and a property scheme of type international string. There is a non-understanding of what datatype is doing. It's a computation type.
  • XML Schema modeling issues: Ambiguous type definitions in two places - Collections and Representations

NEXT WEEKS AGENDA: Tabled discussion on bi-directional relationships (see above)

TC will put Versioning paper on agenda

AGENDA March 11 - Versioning

 Virtual meeting 18 February 2015

Attendees: Flavio, Arofan, Wendy, Jay, Johan

1) Representations: Larry, Jay and Dan will meet later today

2) Flavio's use case on Concept system, classification

  •  Did concept system as a collection and the XML was horrendous

3) UML does not support specialization and add constraints and some other language

  •  choose to ignore the existence of members and add as part of extension
  •  The good news is knowing that concepts are members
  •  the bad news is that you lose the idea of members
  •  You shouldn't let the tail wag the dog - "We should just bite the bullet and stop complaining"
  •  It works in RDF but that doesn't care about validation
  •  The XML validation (limit by documentation) is what's missing here
  •  When Ornulf was defining the structure of data points using JSON - it was the same thing we did with sets of ordered relationships
  •  Treating a data store as a collection of data points that has a structure
  •  Where does this get us practically
  •  The main requirement was dealing with the reusability of Members
  •  We could separate them and say a concept system works the same as an XCollection etc.
  •  We are up against the wall here because we can't get it to work with schema
  •  If we had to make a decision now it would be with a design pattern
  •  Restrict in terms of documentation and say we're working on the restriction issue
  •  We're going to have problems with any implementation model in various places

 Design Principle:

 We should leave as much richness in the model (i.e. restrictions) even though some implementations can't express it. Depending on the implementation technology you will have different problems. Instead of working to the lowest common denominator you design the model correctly and deal with the specific implementation problems in the binding.

 Example: For XML schema you implement a collection as a pattern. For RDF you treat it as an inheritance structure.

  • We will have to come back to this issue and find out what this looks like in the bindings
  • Oliver and Achim are hammering on the XML schema - it will change things but not the model
  •  If the model is right we can move forward and deal with the specific implementation bindings
  •  If you take the principle that you are going to ignore the abstract and do the concrete stuff in XML. The abstract stuff still constrains the concrete stuff to certain sets of things and therefore is more consistent.
  •  What I don't like about it is that most of the time XML manages the abstract stuff...just not in this instance.


4) DrupalReview-Feb1.docx (From Flavio)

 1. Shouldn’t ClassificationFamily have a relationship to ClassificationIndexes rather than ClassificationIndexEntries? Having a direct relationship to ClassificationIndexEntry seems wrong. CHANGE
 
 2. What’s the difference between 'isBasedOn' and 'predecessor' in StatisticalClassification? isBasedOn was never documented in GSIM, and the current definition seems very similar to predecessor. REMOVE isBasedOn and retain variantOf
 
 3. How do we deal with abbreviations? (e.g. in Vocabulary) Are they a type of label or localId or something else? We should have common attributes. CREATE the documentation that abbreviations are always international string (aka label etc.) This should go in communications document
 
 4. How do we deal with abbreviations? (e.g. in Vocabulary) Are they a type of label or localId or something else?
 COMMENT: What are the rules for review? When a URN is a possibility it should be URI. If only a URL then URL. Document in communications document.
 
 5. Type in LevelParentChild should be set total (to specify the constraint that the level hierarchy is linear). Do we write it in the definition (as it is now) or create the property again and assign the constant total as initial value? no longer an issue??
 
 6. In CorrespondenceTable we have a date property to indicate the date of validity of the correspondence. Do we want to make it into an 'interval of validity' for a more generic time travel support? This is related to a broader question about how we want to support time travel in Collections and Correspondences in general. CONSISTENCY for valid dates and use of Date: should use "effectivePeriod". We should be consistent everywhere, so it also needs to go into the communications document.
 
 Do 7-9 offline. 

 Virtual meeting 11 Feb 2015

Attendees: Arofan, Wendy, Johan, Oliver, Dan, Flavio, Jay, Achim, Larry

XSLT schema production work: Oliver can work on this next week; make sure everything is OK; Achim will send Oliver list

1) Documentation XHTML tags  

  • Achim's example would have embedded tags to support references to internal documents and examples  
  • Both XML Schema and OWL would provide implementation specific examples  
  • XSDDoc is what we would use
  • It understands click-through documentation
  • Difficult to read with extra tags - so have 2 versions (one with extra tags to create click-through and another with tags stripped out for easier reading of schema)
  • Do the tools creating OWL support embedded XHTML tags?  
  • Both require a merging XSLT to integrate by unique object name  
  • Conceptually the right approach would be to use the DocBook  
  • This could possibly be realized in Drupal so we don't edit the DocBook by hand - how much do we currently do by hand  
  • Difference between actual hand work and a consistent known process  

ACTION: Achim will continue to look at this focusing on OWL and how a merge can be done as a second step (schemas created and documentation merged in)  

ACTION: Achim will come back with a complete proposal

2) Object review  

  • Capture and Response Domain: Is the use of Capture as the type for isPopulatedBy in InstanceVariable?  
  • Publish and note the fact that dependencies need to be reviewed (we know what should go together but not how they interact with objects in other packages)
  • List of 10 or more outstanding questions - representations - DrupalReview-Feb1.docx  
  • Dan, Jay, Larry will review at NSF meeting next Wednesday afternoon  

ACTION: Go over questions in Flavio's list next meeting  

3) Universe, Population, Unit  

  • Entered as requested  
  • When camel casing a non-hyphenated word don't upper case the second part

4) Process  

  • Ran into the specialization issue  
  • Added ways to manage sequence of things temporally using calculus  
  • Reviewed with protocol for containing sepsis in hospitals and it seems to work but still dependent upon specialization resolution  

5) Specialization  

  • This would not be possible in Java but there is a way to do code generation from XMI and you can do it in XML schema but it's a bit ugly
  • Flavio nailed it in his email (added to 2015-02-04 minutes)
  • Leave the new name, document that it is a specialization of class X in package Y
  • We don't have the distinction between the specialization types: additive extension and restriction
    • Temporal relations - fuzzy predecessor and fuzzy successor; contains and during are more specific (additive)  
    • Vocabulary can contain concepts because it inherits from a concept system but can also contain classification items but you only want it to contain one or the other   
    • Could SOME of this be solved by careful modeling  
  • Collection is still a problem and so cannot be solved by this  
  • We need to have examples. ACTION ITEMS:
    • Send schema examples to Oliver (Arofan)
    • ConceptSystem and Concept (Oliver will do for overriding example)
    • Collections review specialization of relationships (Flavio)   
    • If someone comes up with a solution Jay can test it a little (example of making a data store as a collection)  
  • There are overlaps between fuzzy predecessor and more specific ones and they can co-exist; contains overrides fuzzy predecessor or fuzzy successor
  • We need to look at examples and come back to this - we need to focus on the review stuff

 Virtual meeting 4 Feb 2015

Attendees: Arofan, Wendy, Larry, Johan, Oliver, Dan

Olof is all out of time in February so won't be able to do anything to Drupal until March. Requests should also go in bug tracking for Drupal: https://github.com/ddialliance/lion

Requirements for specialization  (summarized by wlt)

  • Dan: Understands the problem. Concerned about what is driving the need. We modeled a concept system as a kind of collection so doesn't say that a concept system needs to be specialized to concepts only. What that says is the underlying model is not right. That question doesn't seem to be on the table at all. Want to get the subject matter right and then get the structure right.  
  • Wendy: how do you make a consistent structure that requires a specific object?  
  • Dan: Why can't the relationship of member to collection sub-type the relationship?
  • Larry: There is a contains relationship from Collection to Member and from concept system to Concept  
  • Dan: Can the relationship be specified as a subtype which would be more precise but will our use of UML support this. If we rely on names we could make a mistake.  
  • Larry: Makes sense to be more exact but don't know what tools support  
  • Dan: Possible EA doesn't support. Drupal has to recognize it and get it into the XMI correctly.
  • Johan: Not an XMI expert.
  • Oliver: Does this overload an existing variable and how would it work? Don't think this is doable in the UML. The property is just the name of the line. I don't think the relationship can be expressed. Without having a final answer, I don't think it is possible in UML and EA. Can rely on name but can't make it explicit. What change is required in the XSD? Should be able to overload in XML by relying on names. Don't think there is another way to make this explicit.
  • Dan: So the namespacing of the superclass is part of the sub-class and if there is a conflict then the child definition is what is used?  
  • Oliver: Worried that you don't override the getter and setter methods. Will check on this.

List of changes from Wendy and Flavio

  • Good to have a list of include and build items generated

Changes for Agent  

  • Johan: Cardinality change on IndividualName is in drawing but not list of changes  
  • Wendy: Action - make changes

Look at the list of Required objects and make a recommendation for action:  

  • Additional option: Create a temporary package to house those objects required for current publication - solves the problem for this release  
  • ACTION: review objects  

Discovery View doesn't have a "thing" to view or discover

  • This is a very limited view and needs text accompanying it to clarify that it is VERY basic and essentially a cataloging record at this point. It will be expanded.

ADDED NOTES FROM EMAIL FOLLOWING CALL

Hi Folks,
 I went over the minutes and just wanted to clarify a point about the specialization issue. I don't know what was discussed this morning exactly, so if I am completely off the mark I apologize.

A Concept System is a Collection, but it's not just a Collection. Therefore, its Concepts are Members, but not just Members. As a consequence, the contains relationship in Concept System needs to be specialized (restricted) to the subclass of Members that are relevant to the Concept System, that is Concepts. Conceptually, that is what we want to say in our model; as long as we say it, the model is fine.

The problem is that the language we want to use to say it (UML) doesn't support specializations of relationships. We had a similar problem with the former order relation: we couldn't say that parent-child in NodeSet was a specialization of predecessor-successor in Collection, which forced us to reify the relationship into a new class: OrderRelation. That is a well-known workaround to specialize relationships in UML, i.e. making them into classes. We don't need to do that in other languages, like RDFS, which supports in fact specialization of relationships (in fact, in RDFS they are first-class citizens).

Now, we have two cases of specialization: (1) of relationships with the same name, (2) of relationships with different names. Arofan's document addresses (1) by means of overriding the same name relationship with a new target class. For (2) we don't have a solution yet since overriding works only with relationships with the same name. For instance, in our UML model, the parent relationship in NodeParentChild is formally not a specialization of predecessor in OrderRelation, regardless of what the description says -- it's just a new relationship that stands side by side with predecessor inherited from OrderRelation. In other words, NodeParentChild has currently four relationships: parent, child, predecessor and successor. There is no way in UML to formally say that parent is a specialization of predecessor.

If we cannot live with that issue, then I believe there are only four solutions, three within UML and one outside of it.

 Within UML:

(a) Get rid of generic Collections, Correspondences and any other class that requires specialization of relationships. Drastic but effective.

 (b) Get rid of the relationships with different names in the specialized classes and use Arofan's solution. Relationship names are not going to be as meaningful as they are today, since parent will be replaced by predecessor, etc. But correct.

(c) Apply reification again and make the relationship into classes, e.g. ParentClass, ChildClass, etc. and use the usual specialization mechanism for UML classes. Cumbersome, ugly, but correct.

Outside UML:

(d) Use a constraint language to say what we cannot say in UML, e.g. OCL. Cumbersome, but correct... and already discarded.

Now, please tell me if all this was already discussed and I will stop babbling :)

Cheers, Flavio

PS: to answer your question Larry, I think your diagram is correct. (referencing Larry's diagram added to Arofan's paper)
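
Illustration (added for clarity, not part of the email): the two mechanisms discussed above can be sketched in Python, used here only as convenient notation since the model itself is UML. The first part restricts an inherited, same-name contains relationship to a narrower target class, as in case (1). The second part shows a reified relation class that is specialized by subclassing, with parent and child as read-only aliases of the inherited predecessor and successor rather than as additional relationships. The member_type attribute, the accepts and add methods, and the alias properties are hypothetical devices invented for this sketch.

```python
from dataclasses import dataclass, field
from typing import ClassVar, List, Type


@dataclass
class Member:
    name: str


@dataclass
class Concept(Member):
    """A Concept is a Member, but not just a Member."""


@dataclass
class Collection:
    # Hypothetical device: the class of Member that `contains` may hold.
    member_type: ClassVar[Type[Member]] = Member
    contains: List[Member] = field(default_factory=list)

    def accepts(self, member: Member) -> bool:
        return isinstance(member, self.member_type)

    def add(self, member: Member) -> None:
        if not self.accepts(member):
            raise TypeError(
                f"{type(self).__name__} only contains {self.member_type.__name__}"
            )
        self.contains.append(member)


@dataclass
class ConceptSystem(Collection):
    # Same relationship name, restricted target class (case 1).
    member_type: ClassVar[Type[Member]] = Concept


@dataclass
class OrderRelation:
    """Reified relationship: the relation itself is a class."""
    predecessor: Member
    successor: Member


class NodeParentChild(OrderRelation):
    """Specialized relation with two relationships, not four:
    parent and child are views on predecessor and successor."""

    @property
    def parent(self) -> Member:
        return self.predecessor

    @property
    def child(self) -> Member:
        return self.successor
```

In this sketch a ConceptSystem rejects a plain Member while a generic Collection accepts it, which is the restriction the email says UML cannot state formally for relationships.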

 Virtual meeting 21 Jan 2015

Attendees: Arofan, Wendy, Flavio, Larry, Dan, Jay, Johan, Jon, Jannik

(1)    Jay’s re-modeling in the process area

  • Jay will circulate a draft for discussion

(2)    Contact info cardinality   

  • Relation items are not really reusable - should be properties   
  • Cardinality of web-site 0..n no validity statement or effective period   
  • Effective period add to web-site   
  • Contact Information - cardinality 0..1   
  • properties should be consistent    
  • type of location is part of address          
  •  Looking at properties of Contact information for consistency   
  • Moving regional type of location and location name onto address   
  • Move privacy attribute onto all the properties  
  •  IPU international standard - component point of view ISO 19773 turned it into a model     
  •  move ResearchID to Agent AgentID - expand documentation (public identifier for an organization, individual, machine) provide examples         

ACTION: Wendy will look at ISO 19773, revise, and provide a model for approval by the group

(3)    Achim-related points which came up last week (XMI export)

  • Moved to next week 

(4)    Universe, Unit, Population, etc. (review of that part of Drupal)  

  • No one really right answer  
  • Population a subclass of universe; universe a subclass of unit type  
  • Instance variable was linked to Universe because of common parlance - they can be reused over time unlike Population which is a specific time/place
  • However if Instance Variable is an instantiation then Population is OK
  • We have to recognize this is a bit different than how people talk about variables  
  • Move represented variable down to Universe would make the most sense  
  • Universe is a specialization of a Unit Type not a conceptualization  
  • Unit Type is the most general and can be repeated at the Universe but at application they are specialized in some way  
  • Unit Type should be linked to Conceptual Variable  
  • Problem of applying the idea of Sex to both Bears and People  
  • Change relationship type between Unit Type and Unit to instance  
  • Unit to Population contains or instantiates Unit should just go to population as a sub-type of Unit Type   
    • Having a direct relationship between Unit and Unit Type helps in computation  
    •  Would we then need the relationship between Unit and Population because it would be inherited   
    • Any given population contains units - could have a conflict between Population and Unit Type with both relationships   
    • Fuzzy about need for Unit to go to Unit Type - if the Unit is related to a population then you know the unit type through inheritance   
    • If you wanted to go the other way - all units associated with a unit type   
    • Could leave off relationship between Unit Type and Conceptual Variable   
    • What we haven't modeled is that conceptual variable has to be a use of a concept so that the Unit Type may be identifiable from above   
    • In GSIM a UnitType is the use of a concept  
  • UnitType should be extended from Concept - adds the semantic that this is a use of a concept as a unit type; it's the role (remove other content from Unit Type and change to extension base Concept)
  • Rename relation type - Universe should be a subclass of UnitType  
  • Unit to UnitType instantiates  
  • Relationship from Unit to Unit Type should be an aggregation with an open diamond
  • ABS example: Data base of people can they have more than one unit type? Unit Type should be general enough that a Unit should apply to one and only one unit type  
  • Unit Types need to be mutually exclusive ideas; need examples. The problem is that concepts don't obey strict mathematical laws. Accept the idea that sometimes a similar idea may be listed as a Unit Type and in another context it is a Universe. Concepts can take multiple roles, except maybe in a particular usage. Roles revolve. We need to explain what we mean by designating something a Unit Type. There is a concept of a community of practice, with different sets of specializations and different types of generalizations. If a Unit Type is a use of a concept, you can have hierarchies of concepts, so you can express this in Concepts. These have similar concepts which are part of a concept system. Should be a constraint. Unit Type is a specialization of a Concept, which can belong to multiple systems. How can you tell which concept system is involved?
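
Illustration (added for clarity): one possible reading of the points above is that UnitType extends Concept, Universe is a subclass of UnitType, Population is a subclass of Universe, and each Unit instantiates exactly one UnitType. The Python sketch below is only that reading written down; the attribute names, the example values, and the units_for helper are hypothetical, and the revised model may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Concept:
    label: str


@dataclass(frozen=True)
class UnitType(Concept):
    """Use of a Concept in the role of a unit type."""


@dataclass(frozen=True)
class Universe(UnitType):
    """A specialization of a UnitType."""


@dataclass(frozen=True)
class Population(Universe):
    """A Universe fixed to a specific place and time."""
    place: str = ""
    period: str = ""


@dataclass(frozen=True)
class Unit:
    identifier: str
    unit_type: UnitType  # exactly one; a Population also qualifies


def units_for(units: List[Unit], unit_type: UnitType) -> List[Unit]:
    """Go 'the other way': all units associated with a given unit type."""
    return [u for u in units if u.unit_type == unit_type]
```

Because Population inherits from UnitType in this sketch, a Unit attached to a Population still has a unit type through inheritance, which is the point raised in the discussion about whether a separate Unit to Unit Type relationship is needed.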

ACTION: Flavio will revise model based on this discussion and send to group  

 Virtual meeting 14 Jan 2015

Attendees:  Arofan, Flavio, Jay, Johan, Larry, Oliver, Jannik, Wendy

  1. Communications document:  
    1. Add:  
      1. correspondence description/mapping - one page high level (Flavio)  
      2. include any relevant pointers (next 1-2 weeks)  
      3. process model (Jay)  
    2. Intended to be a primer for the business modelers 
    3. Would like to get a draft out for people to comment on
  2. Annotation  
    1. change extension base to Identifiable and add relationship to Annotation from AnnotatedIdentifiable (Wendy)
  3. Review of complex data types (Oliver)  
    1. Representation reference type - reference type from simple codebook package (Wendy)   
      1. Needs to be moved to complex data types   
      2. Move RepresentationReference to dead 3.2 stuff
    2. URL object in complex - change to privateimage and remove URL privacy objects off (Wendy)  
  4. Where is the contact information privacy - and where is the time stamp and reuse  
    1. review cardinality of effective dates and instant messaging  
  5. Conceptual  
    1. Category - use of the concept as a category it is totally a semantic  
    2. Johan put the comment in Drupal and we will raise it in the review  
    3. Hesitant to get rid of this well known object rather than use it  
  6. Review impact of population, universe, etc. on Conceptual, Represented, and Instance Variable (Flavio) (next few weeks)  
    1. Inheritance is used in all objects - repetition of label definition in inheritance  
  7. Sequence (Jay)  
    1. Ordered relationship is currently a predecessor/successor relationship  
    2. Extended concept to non-linear ordering of relationship  
    3. Sequence order based on standard Allen's interval algebra with 7 reciprocal relations (see SequenceOrderRelation)
      1. Could do a simple one with precede/succeed and equal
      2. Extend for other Allen terms - should not be an extension of the order relationship
      3. Create a separate TemporalRelationship   
      4. Generally agreed that the view of a process as a collection is not important here   
      5. Separate objects as pairs (review and subclass in a more meaningful way)   
      6. Jay will revise and then talk to Flavio   
  8. How are we going to finish production framework stuff  
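
Illustration (added for clarity): the temporal relations discussed under item 7 come from Allen's interval algebra, which has seven relations when counted with their converses folded in (thirteen when the converses are counted separately). The Python sketch below computes the relation between two intervals from their endpoints. The function names and the (start, end) tuple representation are assumptions made for this example and are not taken from SequenceOrderRelation.

```python
from typing import Optional, Tuple

Interval = Tuple[float, float]  # (start, end) with start < end

# Each forward relation and the name of its converse.
CONVERSE = {
    "before": "after",
    "meets": "met-by",
    "overlaps": "overlapped-by",
    "starts": "started-by",
    "during": "contains",
    "finishes": "finished-by",
    "equals": "equals",
}


def _forward(a: Interval, b: Interval) -> Optional[str]:
    """Return the forward relation of a to b if one holds, else None."""
    a0, a1 = a
    b0, b1 = b
    if a0 == b0 and a1 == b1:
        return "equals"
    if a1 < b0:
        return "before"
    if a1 == b0:
        return "meets"
    if a0 < b0 < a1 < b1:
        return "overlaps"
    if a0 == b0 and a1 < b1:
        return "starts"
    if b0 < a0 and a1 < b1:
        return "during"
    if b0 < a0 and a1 == b1:
        return "finishes"
    return None


def allen_relation(a: Interval, b: Interval) -> str:
    """Name the single Allen relation that holds from a to b."""
    for start, end in (a, b):
        if not start < end:
            raise ValueError("intervals must satisfy start < end")
    relation = _forward(a, b)
    if relation is not None:
        return relation
    # Otherwise the converse of a forward relation from b to a holds.
    return CONVERSE[_forward(b, a)]
```

The "simple" option mentioned in the minutes (precede/succeed and equal) would correspond to keeping only before, after and equals from this set.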

 Virtual meeting 7 Jan 2015

Attendees: Arofan, Jay, Larry, Jannik, Oliver, Johan

Decision points from today’s meeting:

  1. Jay to send comments on Communications document, along with proposal concerning the Sequence, esp. around multiple inheritance. Aiming to finalize communications document in two weeks.
  2. We will get Flavio’s input regarding any needed changes to model/Comms doc.
  3. Larry to send proposed changes regarding Annotation object and degree of contribution (or using Wendy’s phrase for this: extent) – refinements for Q1 review, based on Wendy’s work in London and subsequent discussion
  4. Issue with one of the complex data types - Oliver to figure out what this is; others to review Drupal and see if there are any messy areas which need focused work before the Q1 review
  5. Change time to 15:00 CET from earlier 13:00 CET as regular meeting time
  6. Find out how we get Drupal documentation injected into the XMI, so that it is useful in the EA file - have Achim check this.
  7. Could we export DocBook? We need to discuss this further (probably not for the Q1 review).

 Virtual meeting 17 Dec
 

Attendees: Arofan, Wendy, Achim, Larry

AGENDA OF TO-DO ITEMS - Please address what you can over the holidays and make comments as noted

(1)    Review Jay’s process model proposal based on this week’s discussion - (this document will be coming over the next few weeks, keep an eye out for it in the sandbox. Use the comment section of the sandbox and note the name of the document in the first line of your comment)

(2)    Look at the Conceptual package in Drupal, especially with an eye towards nodeset/classification and value domain-related objects - add comments to Drupal

(3)    Look at the Representations package regarding the use of Collections in Nodeset - add comments to Drupal

(4)    Look at the cluster of Universe-Unit-UnitType-Population - add comments to specific objects in Drupal

(5)    Review Complex Data Types - put in comments in Drupal  

  • there are rules for content attribute - is it done consistently  
  • are there any remaining XML specific constructs  
  • is everything documented

(6)    Any of Achim’s issues based on his research into XMI and how Drupal is working - Issues in Jira document (summary below)

  • Starting middle of January - None of these are show stoppers
  • Not a show stopper if these issues are not resolved before release. Would like to address as many as possible.
  • OWL is automatically generated in nightly build - how to make the OWL representation - mapping to other vocabularies is not in the output
  • Need to find a good way to configure OWL specific things and then relate in the script
  • Integrate the documentation - able to enter basic HTML text in the documentation field of XML
  • DocBook should be merged into generation of XML Schema and OWL representations as opposed to doing this directly through XMI

(7)    Review of the work on implementing the Process Model (document in the Sandbox)   Using the DDI 4 Process Model to Describe Historical and Prescriptive Processes_0_1.docx

(8)    Review of the Communications document (add comments to the sandbox page referencing communications document in first line)

To Do List

Enter PairedCodeValueType and change property name to extent - done

Kelly - change meeting time of Modelling team to 13:00-14:00 - done

 Virtual meeting 10 December

Attendees: Arofan, Wendy, Larry, Jay, Achim

Question regarding AgentAssociation

  • AgentAssociation - creator and contributor
  • Role (currently a CV type) needs to have an attribute degreeOfContribution (CV) they need to be bundled -- see credit taxonomy group
  • can you have a degree without a role
  • a structure - containing 2 CV types
  • Wendy will model solution in sandbox
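
Illustration (added for clarity): the bundling of a role with a degree of contribution as "a structure containing 2 CV types" could look like the Python sketch below. The class and field names are hypothetical (the 17 Dec to-do list records the names PairedCodeValueType and extent for the eventual solution). Making the role mandatory and the extent optional is just one possible answer to the open question of whether a degree can exist without a role, not a recorded decision.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodeValue:
    """A value drawn from a controlled vocabulary (CV)."""
    value: str
    vocabulary: Optional[str] = None  # hypothetical: name of the CV


@dataclass(frozen=True)
class PairedCodeValue:
    """Two CV values kept together: the role and, optionally, its extent."""
    role: CodeValue
    extent: Optional[CodeValue] = None


@dataclass(frozen=True)
class AgentAssociation:
    agent: str  # hypothetical: an identifier for the Agent
    contribution: PairedCodeValue


# Example with made-up values: a creator with a stated extent,
# and a contributor for whom no extent is recorded.
lead = AgentAssociation(
    "agent-1", PairedCodeValue(CodeValue("creator"), CodeValue("lead"))
)
helper = AgentAssociation("agent-2", PairedCodeValue(CodeValue("contributor")))
```

With this shape an extent can never appear without a role, because the role field has no default.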

Note there is now a link from the modelers page to issues in JIRA (see bottom of Modelers page)

Review Communications document:

  • Note former Name-Label-Description decisions
  • ComplexDataType - review for CV issue (expressivity of XMI) you cannot extend a primitive - there is a rule but it's not expressed anywhere
  • Look at Using the DDI 4 Process Model to Describe Historical and Prescriptive Processes_0_1.docx in sandbox
    •  core process model with extensions  
    • historical  
    • prescriptive (GSIM like)  
    • use case of library structure  
    • requirements for describing provenance - showing how process intersects with agents at a time  
    • names are temporary (provenance and process design)  
    • this was a modeling exercise and needs to be reviewed by someone with more content knowledge  
  • New conventions were adopted but there is no documentation on what happened and why - this is the intent of the communications document
  • Flavio is updating with collections stuff
  • This document needs to be more visible

Issues from Sprint

  • review of process package
    • There is a nested property inside control constructs to chain processing ElseIf  
    • Review IfThen  
    • Remove hasElseIf  
    • Jay will put together changes in sandbox for next week to review (Switch)
  • Classification - node sets - implementation of collection  
    • Universe to unit  
    • Instance variable  
    • Representations  
  • Simple data description now has conceptual and physical   
  • There seems to be a lot of unnecessary plumbing
    • Universe  
    • Classification  
    • Described and enumerated domains - should maybe remain abstract (do they need to be modeled creating 2 ways of doing the same thing) - is there a value in having these as concrete values  
    • More tractable to connect variables at different levels (more tractable in the machine sense)  
  • Complex data types  
    • A lot of clean-up was done and these need to be reviewed  
    • Need to look at for unintended results  

Agenda for next week:

  • Review of process package
  • Changes in conceptual  
    • Classification  
    • Universe  
    • Collection  
    • Abstraction of value domain construct
  • ComplexDataType changes - implications 


 Virtual meeting 19 Nov

Attendees: Wendy, Therese, Larry, Flavio, Jay

Agenda:

  • Administrative objects (tabled for lack of Achim)
  • Olof's issues list from Drupal
  • Collection
  • Process issues - game plan - issues for Epic 3 in London Sprint

Drupal List:

  • Larry will go through Discovery related issues to clean up
  • Flavio will go through Conceptual
  • Where we are uncertain (i.e. cardinality) base on 3.2 or best guess and note uncertainty in comment
  • Priority is with finalizing those packages/views going out in January/February

Communications Issues:

  • Need to create a communications link back to content groups regarding review of changes etc. In theory this goes through the modeler but many times these are raised in meetings that modelers can't attend. Need a record.
  • Relaying decisions regarding doing things a single way.
    • Need to tag those objects that have been rejected and point to appropriate content
    • Need to document in terms of "I need to create a set....reference to objects and their usage"

Collection:

  • Flavio will enter changes in Drupal
  • Need to generate examples in addition to Classification for collection usage (simple like a set of OtherMaterial, complex provenance of a resource)

Process:

  • Created a sandbox page on Modelers page to capture ideas currently floating around between individuals on email
    • Clarifies that these are discussion pieces rather than specific proposals
    • Jay will put a number of his papers here
  • Need something similar for playing around with model
    • We have the status tags in Drupal but need the equivalent of a sandbox
    • Options:
      • Sandbox area in Drupal
      • Lucidchart (integrated with JIRA etc.)
      • Gliffy (integrated with JIRA etc.)
      • Put on Epic list for Sprint

Epic 3 has been updated with process items and To Do's

 Virtual meeting 12 Nov

Attendees: Arofan, Larry, Jay, Therese, Wendy

Agenda:

  • Library objects
  • Collection model
  • Process model
  • Discovery
  • Leave administrative objects until next week when Achim is available

Library objects  

  • Jay has a deck of library structure, would like to discuss with Wendy and Arofan first  
  • Fixing Drupal display in view - Johanna is working on   

Collection  

  • Should there be something in the base structure that describes likeness of members - homogeneity issue
  • Uniqueness identifier
  • The addition of the classification map is an extension which tightens down the cardinality - specialization  
  • Not use it except where it makes sense to use - need to provide advice to content modelers about use  
  • Collections and View are different things  
  • Need basic rules for application  
  • Communications Document for content Modelers needs to be updated to cover 

Processing model - In/Out Parameters

  • Jay will finalize and send out

Discovery

  • View and package don't match
  • All the Disco stuff - needs to be rendered and mapped  
  • Are they properties or extended primitives - can they be attached to an RDF
  •  isProperty is an extended primitive  
  • Create Citation objects and extensions
  • Modeling of objects in Discovery - Jay will work on this with review assistance from Larry  
  • There are a number of different things going on -
    • Disco Discovery View   
    • There is some discovery stuff like Coverage which has been semi-modeled   
    • New objects for Discovery which were all the things in Disco mapped to RDF (Toronto)   
    • Discussion around Citation and how these things relate   
    • Discussion in Dagstuhl about NIH data discovery index - a type of "citation" - distinction based on intended use    
      • Intellectual ownership    
      • Use of material (sub-sets of a data set etc) - linking back   
    • Distinction between a reference of some sort and not trying to model the actual objects like Disco did (Disco used existing objects in DDI-C and DDI-L)  
    •  Jay will write up proposal on what can be created as a view for Package 1 (we need to get buy-in from Advisory Group)      

ACTION ITEMS

Questions for Drupal - Wendy  DONE

  • New views are not showing up under View - we can find them through the search
  • Ask about alternate cardinalities 2..n, 4..n (points in a polygon)
  • How can we choose which packages and views to publish in terms of the build

Add issues page to Modeling Team page for content people - Therese  DONE

  • Send announcement to teams - Wendy DONE

Communications Document for content Modelers 

  • extend to cover identification and name, label, description piece - Arofan

Agenda next week:  

  • Administrative  
  • Who will do the spade work  
  • Flavio - any additional issues  
  • Library organization proposal if ready

 Virtual meeting 5 Nov

AGENDA:

  • Action list on modelers page
  • Library structure
  • Items from TC proposal to date

Summary of Actions Requested

Library Structure

How do we get an organizational structure for objects

Criteria:

Organization within Drupal that is good for maintenance and good for deliver of the entire example: Process will be used in a number of views

  • Items that will be used as a set of objects that get applied as generic structures
  • Different versions of objects are organized together
  • Dependencies became apparent resulting in grouping - variable stuff
  • Sets of items that act as interfaces between other sets of objects
  • Jay will do a high level draft ACTION ITEM

Identification:

  • How does this manifest in the model? As 3 items or single string (parsing costs)?
  • Full identifier is necessary (full only or full with optional or full both ways) - model vs. implementations
    • individual in model but just URN in implementation  
    • URN only for XML  
    • ACTION item RDF strategy decision PARK and  
    • Document discussion of John Kunze and Jay (check with Benjamin Zapilko (Oliver) and Thomas Bosch (Achim))
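
Illustration (added for clarity): the trade-off above between holding the identifier as three items in the model and as a single URN string in an implementation (with its parsing cost) can be sketched in Python. The urn:ddi:agency:id:version layout, the field names and the example values are assumptions made for illustration and are not a decision recorded in these minutes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Identifier:
    """Model side: three separate items."""
    agency: str
    id: str
    version: str

    def to_urn(self) -> str:
        """Implementation side: one string (assumed layout)."""
        return f"urn:ddi:{self.agency}:{self.id}:{self.version}"

    @classmethod
    def from_urn(cls, urn: str) -> "Identifier":
        """Parse the single string back into its three items."""
        parts = urn.split(":")
        if (
            len(parts) != 5
            or parts[0].lower() != "urn"
            or parts[1].lower() != "ddi"
            or not all(parts[2:])
        ):
            raise ValueError(f"not a urn:ddi:agency:id:version string: {urn!r}")
        return cls(agency=parts[2], id=parts[3], version=parts[4])


# Hypothetical example values.
ident = Identifier(agency="int.example", id="var-001", version="1.0.0")
urn = ident.to_urn()
```

The round trip through from_urn is where the parsing cost mentioned in the minutes shows up: every consumer that needs the agency or version on its own must split the string.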

Properties and Relationships

  • There were extended primitives being created by different people in different groups
  • These need to be in a single place in the library
  • Would it help to have a package of "proposed extended primitives" - YES Decision  
    • Step one - find and move to "proposed extended primitives" - Jay  
    • Extended primitives cannot have an identifier  
    • Can we generate a list of the users of each extended primitive? also Objects? - Wendy bring to Jon, Olof, Johanna  

Managed objects  

  • Difficult to specify so provide a full set of the optional objects and allow the implementor to determine what to use  
  • Create an object with core properties (Identification, administrative) then extensions with various sets of the NLD properties  
  • Inheritance of administrative information (dependent upon grouping)  
  • "what is administrative in the sense of causing versioning" - need to review   
  • ACTION - all review administrative objects, raise questions for next week  

Get rid of Name use a local ID (currently userID)  

  • Change userID to localID - does this cause a version change

Label - refine definition to remove    

  • Designation or label - Not a category and not a code. Use the Name Label and clarify this is not a code or category, it is a display object

Description - same set of attributes as a label; make it repeatable

Definition as proposed

"Make it so"

Decisions:

  • Identification structure approved as proposed (decision pending on RDF implementation strategy PARKED for additional information)
  • Create a "Proposed Extended Primitives" package in Drupal where content modelers can create new extended primitives to be reviewed by modelers and then moved to Extended Primitives
  • Separately raise the issue of "what is administrative in the sense of causing versioning" - agenda item for next week. Do not differentiate truly managed objects within objects. All objects will have the available properties; individuals determine management
  • Drop Name (Locator) and change UserID to LocalID with the Definition and Usage proposed for "Locator"
  • Retain Label changing definition to remove reference to a category or code. It is a display object
  • Description gains the same set of attributes as a Label and becomes repeatable
  • Accept Definition as proposed

Action items:

  • High level draft of library structure (Jay)
  • Obtain input on RDF implementation of identifier from Benjamin Zapilko (Oliver) and Thomas Bosch (Achim)
  • Identify proposed extended primitives in individual packages and move to new location (Jay)
  • Request a means of generating a listing of where extended primitives and objects in general are used, bring to Jon, Olof, and Johanna (Wendy)
  • Review recommendations on administrative objects as available properties on all objects, raise questions for decision 2014-11-12 (ALL)
 
 Dagstuhl meeting, 20 October

Attending: Arofan, Jay, Dan, Barry, Jeremy, Wendy, Achim, Jon, Larry, Kelly

Prioritized action items:

  1. Classification issue support - Classification team members present in Dagstuhl will meet on Tuesday at 7pm to discuss Dan's use case (he has Hilda's input already).
  2. Other conceptual changes - Conceptual team members present in Dagstuhl will meet on Tuesday at 7pm to discuss changes needed to create stable core.
  3. Rules for modelling need to be documented - Barry to help with creating a communications document.
  4. Documenting use of process model for the content teams - Jay to discuss in evening session later in the week.
  5. ID-Versionable-maintainable issue - Give the position statement created for TIC by Wendy to John Kunze to review and get his feedback. Arofan and Achim to follow up.
  6. Namespaces - Jon will explore issues resulting from connecting to points of contact in other namespaces.
  7. Organization of the library - can be done virtually, though ought to be considered in Dagstuhl if time available.

Other topics:

  • Defining Collections - Flavio's work on collections is with TIC right now. Will be evaluated after their review.
  • Be more proactive with the content modelers while in Dagstuhl.



 Virtual meeting 1 Oct


Attending: Arofan, Wendy, Larry, Jannik, Jay, Flavio

Update on Classification

  • Classification has finished the analysis of what to put in the view - including new objects
  • Need to change some associations and will send out for next week's meeting
  • Conceptual library includes the classification objects - the March release is the classification view
  • Good to have this together before Dagstuhl
  • Jay will send out a view of what they are looking at for variable

RDF mappings in Drupal

  • Is the RDF mapping piece enough?
  • We need instructions
  • Need to be able to map properties to RDF
  • When we create this RDF syntax we need to make sure it is correct and consistent
  • We need to create instructions
  • We need to review the content and consistency in the RDF bindings
  • We need clearer direction (Jon is pulling together what exists elsewhere as a base to work on)


Discovery

  • Discovery AND NewObjectsForDiscovery need to be merged and have prefixes removed - Jay
  •  Foreign objects are RDF mapping at the object level
  •  Need own native objects which need to be promoted from 3.2
  •  Use the assumption that it is owl:sameAs and then review
  •  Keep process notes
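
As a rough illustration of the "assume owl:sameAs and then review" approach noted above, the following minimal sketch emits one N-Triples statement per object-level mapping. The DDI base URI and the two mapping pairs are assumptions made up for the example, not agreed mappings; only the OWL, FOAF and DCTerms URIs are real.

```python
# Hypothetical sketch: emit owl:sameAs statements for object-level foreign mappings.
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

# Assumed base URI for DDI objects; not an official namespace.
DDI_BASE = "http://example.org/ddi4/"

# Illustrative object-level mappings (placeholders, not agreed mappings).
FOREIGN_MAPPINGS = {
    "Individual": "http://xmlns.com/foaf/0.1/Person",
    "Title": "http://purl.org/dc/terms/title",
}


def same_as_triples(mappings, base=DDI_BASE):
    """Return one N-Triples line per mapped object, sorted by object name."""
    return [
        f"<{base}{name}> <{OWL_SAME_AS}> <{target}> ."
        for name, target in sorted(mappings.items())
    ]


if __name__ == "__main__":
    for line in same_as_triples(FOREIGN_MAPPINGS):
        print(line)
```

Keeping the mappings in a single table like this would make the later review step (deciding whether owl:sameAs is really the right relation for each pair) a matter of editing data rather than code.
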

Modeling questions raised in Simple Instrument:

1 - if we are going to do one thing only - there should only be one way to do any one thing
2 - grouping will be available in the next 2 weeks
3 - duplication should be handled by the modelers
4 - statements in questionnaire

Statements, Instructions, Related materials: are these the same or different, and how do they fit into a process model and possibly a question?


next week is Agent

 Virtual meeting 24 Sept

Attendees: Arofan, Wendy, Jay, Oliver

Put on to do list:

Review papers from Toronto and determine the organization of the packages in the library
Toronto - Library organization
http://www1.unece.org/stat/platform/display/DDI4/Packages
http://www1.unece.org/stat/platform/display/DDI4/Files+shared+among+team+at+Sprint
http://www1.unece.org/stat/platform/display/DDI4/Structure+of+DDI+4
http://www1.unece.org/stat/platform/display/DDI4/Vancouver+Sprint


Order of work at Dagstuhl is to define the structure.

  • Library packages - what should these be / directions for content people
  • Views - creation
  • Have to define what they need and put it in a package of "New Objects for Review" rather than a package of their "own"
  • In terms of cleaning up Drupal we need to come up with a basic plan during Mannheim


Conceptual

  • Simple data description group has been working on different classes of variables and it's pretty far along
  • Includes conceptual variable, represented variable and instance variable
  • It changes the simple data description a bunch
  • A new type of instance variable was created that had additional components that it didn't have previously and goes beyond what is in GSIM
  • Will be finished in a few weeks and this could then be taken back into conceptual and simple data description


Agents

  • Term Machine - is this an appropriate name
    •  I have a machine which is an agent but no identifying information for contacting or finding a machine (means of identifying specific machines or software) such as a URL for a program
  •  What do we need to identify this
  • Larry has a proposal going around on Agent because of citation - unfailing identification - citation of a machine
  • Where does the name property live (agent or substitution type)
  • Simple process uses agent - one of the things Larry wants to do is be able to describe processing pipelines and he needs the Agent from the Agent package - see binding usage and 3.2 reference that supports binding at the point of reference
  • Clean up "type"
  • Point - an example is this more than we need; is it more
  • Contact information - electronic communication with a type - can be simplified - internal controlled vocabulary
  • emailCodeType - Underdocumented what are you suggesting
  • individualNameType - is this overkill?
  • Instant Messaging -
  • internetEmail is a restriction on the email format - if you use a restriction is it implementation specific and need to be documented in a conceptual way
  • SexSpecification why an object and not a property
  • Telephone - standard communications thing
  • AdditionalInformation - why???
  • Do we need additional extended primitives (e.g. DatedStructuredString or DatedInternationalString)?
  • Relation - clean up relationship types to reflect current types
  • Keyword should be codevaluetype should be a property - isChild and specific relationship
  • Individual - same issue of inherited Agent name
  • researcher ID should be time bound and typed
  • Do we want to track project relationships to researchers - it's a question not within the agent package but within the overall model - this is a question of scope

Arofan will send out message regarding weekly meetings and try to get agendas out earlier to group


 Virtual meeting 17 Sept

Attendees: Arofan (chair), Wendy, Jannik, Oliver, Larry, Jay

Agenda:
(1)    Identify which packages can feasibly be released in the first review  (Mary Vardigan has asked for this)
 Stay with original set for Review 1: Conceptual, Discovery, Agent, Core, Process
(2)    An overview of the contents of Drupal for those packages which are supposed to be stable, with the idea to generate a list of issues for each package for subsequent work – this will include all definitions, properties, relationships, etc

Status of each packages for Review 1:

  • Conceptual - walk through in today's meeting
  • Discovery - needs to clean up foreign name spaces - modelers need to clean up
  • Agent - seems to be quite ready
  • Core - needs a final walkthrough
  • Process - need to review

Question to Jon - what's been added to Drupal? Make sure there are instructions for its use.

TO DO LIST:
Review materials from TC regarding basic extension types including identifiable and managed objects, collections, administrative data, etc.


General work to do:

  • clean up all foreign namespaces, particularly in Discovery (task for London Sprint)
  • package namespace and consolidation (NewObjectsFor...)
  • resolve instance variable - currently Jan and Dan G. are working on this
  • clear up duplication on the edges/interfaces between views and packages
  • package review - are things in the right place?
  • is there a mechanism for filtering out incomplete or not-yet-ready-for-prime-time packages
  • What is an attribute, what is an object, what is a mapping?
  • What is the style we are really going to use? That needs to be documented
  • The word xxxType in an object name means we need to review it
  • Finish off the work on material sent from TC
  • Discuss CodeValueType are those in binding or not


Walkthrough of Conceptual Package: Issues

  • Concept system - improve the definition; only contains a definition
  • Designation - *
  • Sign - value is structured string? see 3.2 ValueType
  • Vocabulary - do we need this?
  • AuthorizationSource - what is the relationship to Agent? what is the definition of this? does this need to change? ok?
  • Level - relationship to generic collection
  • Representation - use of CodeValueType
  • Category - review subcategory in light of generic collection; this is broken - a category IS a concept
  • Code - review subcategory in light of generic collection
  • Concept - definition is a footnote not a definition; review isSubClassOfConcept in light of generic collection
  • ConceptualVariable - conflict of hasConcept, hasUniverse as all are actually concepts - these are not extensions of a concept as they are in GSIM; do GSIM or don't
  • RepresentationMap - relates to correspondence map; does it belong here?
  • RepresentedVariable - conflict of hasConcept, hasUniverse as all are actually concepts - these are not extensions of a concept as they are in GSIM; do GSIM or don't
  • SubUniverse - review isSubClassOfConcept in light of generic collection; all are actually concepts
  • Universe - OK pending questions about subUniverse; isInclusive? hasGenerationProcess?
  • CodeList - review of definition; it's a list of nodes
  • Nodec - *
  • NodeSet - isChild, isPart, basedOn, ...review mechanism in light of collections
  • CategorySet - review subcategory in light of generic collection
  • ClassificationItem - explanatoryNotes? review relationships which should be properties with extended primitives
  • StatisticalClassification - everything related to nodeset needs to be done consistently


Process:
Modelers clean up what they can then bring remaining issues back to conceptual group.

Work plan:
Modelers group needs to meet weekly. Check to see if we can shift time an hour past current time (Arofan will check with members)



 Virtual mini-meeting 3 Sept

Attendees: Wendy, Larry, Jannik

Issue to note

  • In developing the SAS to DDI tool Larry has noted that Statistic only supports an xs:decimal
  • It would be helpful to have the option of an xs:double to support very large numbers
  • This could be particularly an issue for environmental or health metrics
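
To make the xs:decimal versus xs:double point concrete, here is a minimal sketch that checks only the lexical forms of the two XML Schema types. It is an illustration under stated assumptions, not the SAS to DDI tool's logic: the function names are invented for the example, and the patterns cover the XSD 1.0 lexical forms (xs:decimal has no exponent notation; xs:double allows an exponent plus INF, -INF and NaN).

```python
import re

# xs:decimal lexical form: optional sign, digits with an optional fraction, no exponent.
XS_DECIMAL = re.compile(r"[+-]?(\d+(\.\d*)?|\.\d+)")

# xs:double lexical form: decimal mantissa with optional exponent, or INF, -INF, NaN.
XS_DOUBLE = re.compile(r"([+-]?(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?|-?INF|NaN)")


def is_xs_decimal(text):
    """True if text is a valid xs:decimal literal."""
    return XS_DECIMAL.fullmatch(text) is not None


def is_xs_double(text):
    """True if text is a valid xs:double literal."""
    return XS_DOUBLE.fullmatch(text) is not None


if __name__ == "__main__":
    for value in ["12.50", "6.02E23", "NaN"]:
        print(value, is_xs_decimal(value), is_xs_double(value))
```

A statistic exported in scientific notation, such as 6.02E23, passes the xs:double check but not the xs:decimal check, which is one practical reason an xs:double option would help.
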
 Virtual meeting 27 August

Attendees: Wendy, Flavio, Jay, Johan

 Procedural Issues

  • There was concern about time-lines for some of the content group activities, how much could actually be achieved by end of September
  • Informed them of discussion in AG regarding a change in the publication time schedule
    • Will inform AG that other teams should be informed about the state of this discussion as it affects their work - we want them to keep working hard but not stress for no reason
  • As the modelers group we need a game plan
    • We have been working on specific topics but don't know how and when they will be tied together
    • Communications within modelers team and between modelers and other teams needs to be clearer so they will know when we are discussing certain topics and they can make us aware of upcoming topics

Things to address in a game plan

  • Capturing more of the conceptual information going into decisions which may be useful to other groups (i.e. Simple vs. Complex)
  • Use of sessions (calls) to discuss dependencies between packages and views
  • Develop an agenda to tackle those dependencies
  • Individual groups need a means of informing modelers of rethinking of content which may affect other groups
    • Example: Simple data description will revisit the instance variable, represented variable, etc to see if they are complete/do what needed for their use case
    • As soon as some idea comes through that touches other packages it needs to be put on the modelers agenda
    • Need an input structure for communications
  • It has to be someone's role to manage the agenda of the group - don't need a modeler to do that
    • The modeler creates the agenda item but the manager gets it appropriately slotted into the agenda and does the follow-up to make sure it is addressed

ACTION ITEMS

  • Inform AG that it would be useful to share information on revision of schedule even before finalized - realize the message is important, but teams need a heads-up that they aren't looking at as strict a deadline as currently (Wendy)
  • Send out a list of what needs to be in a game plan to the modelers for discussion and expansion (Wendy)


 Virtual meeting 20 Aug

Attendees: Wendy, Oliver, Larry, Jannik

Update on elements from other standards

  • This was part of a broader discussion in the AG and there will be a statement out soon regarding current agreements and points that require clarification or implementation
  • Relayed the general consensus from this and earlier discussions that, with the possible exception of primitive types and xhtml, DDI would no longer use native elements from other namespaces in its model
    • xml has been included in Drupal so they can be used as data types but the folder of individual objects is not uploaded to EA for inclusion as DDI namespaced objects
    • Dublin Core will no longer be used as native elements which will require the clear mapping of all DC elements we need to be able to populate from DDI
  • Relayed the sense of the discussion on the need for clear means of capturing equivalence and similarity and how this would work to ensure that it syncs with the process for producing the RDF binding

TOPICS for next meeting:

  1. Equivalence - could people with some knowledge of how the technical side works explain how information is relayed, for the RDF binding in particular; how does this get captured for documentation purposes?
  2. Review of Report to Modelers - accept? edit? what are the implications in terms of what needs to get done
  3. What is the check list for getting a prototype (package one)

Flesh out and add to the following base list:

  • Finish reviewing the individual package content
  • Cross review for consistency
  • Implement whatever is required by accepting/editing the Report to Modelers
  • Implement the means of indicating equivalent objects in other namespaces
  • Technical workflow of moving from Drupal to Bindings

To Do's

Oliver will look at the technical issues - for capturing information in Drupal (see item 1)
Review and OK or raise specific issues for discussion (see item 2)
Automatic output from EA to xsd and rdf and creating several pictures from Drupal still needs some refinements - Jannik will bring this up at the Tools meeting tomorrow

 Virtual meeting 6 Aug

 Attendees: Jon, Larry, Wendy, Jannik

Simple vs. complex in discovery and instrument packages

Agreement upon: More complete package instead of simple and complex versions of packages.

In terms of developing the packages, the process iterates from a simple implementation towards a more complete, complex end state.


Discovery

Presentation by Larry of Discovery spreadsheet from email 20140803 containing DISCO elements and discussion

Agreement upon including the missing elements from DISCO not presently in DDI4


Simple vs. special case

E.g. is discovery a special case ?

Views define special cases


Special cases:

  • E.g. describing information from a special software package
  • E.g. describing data from a special field of science e.g. outer space science


Elements from other standards

Reuse of elements from other standards:

  • DCTerms
  • PROV
  • FOAF


How do we reference other standards in Drupal

  • Yes as RDF


How do we implement this in the model ?

In the RDF implementations it is straightforward via equivalence

In XML there are some possible implementations that can be pursued in the implementation part

  1. Bring in the namespace and use the element directly
  2. Use a DDI4 namespace and deliver a separate documentation with the mappings that way systems can export to specific other elements
  3. And not going the ddi3 way with a mix of both 1 and 2
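
The first two XML options above can be sketched as follows. This is only an illustration: the DDI4 namespace URI and the Study and Title element names are hypothetical, and the mapping table stands in for the separate mapping documentation mentioned in option 2. Only the DCTerms namespace URI is real.

```python
import xml.etree.ElementTree as ET

DCTERMS_NS = "http://purl.org/dc/terms/"
DDI4_NS = "http://example.org/ddi4"  # hypothetical namespace, for illustration only


def option_1_foreign_element(title):
    """Option 1: bring in the foreign namespace and use its element directly."""
    study = ET.Element(f"{{{DDI4_NS}}}Study")
    ET.SubElement(study, f"{{{DCTERMS_NS}}}title").text = title
    return study


# Option 2: a separately delivered mapping from DDI4 elements to foreign elements.
MAPPING_DOC = {f"{{{DDI4_NS}}}Title": f"{{{DCTERMS_NS}}}title"}


def option_2_native_element(title):
    """Option 2: use only DDI4-namespaced elements in the instance."""
    study = ET.Element(f"{{{DDI4_NS}}}Study")
    ET.SubElement(study, f"{{{DDI4_NS}}}Title").text = title
    return study


def export_with_mapping(study, mapping=MAPPING_DOC):
    """Copy the root and its direct children, renaming mapped child tags."""
    exported = ET.Element(study.tag)
    for child in study:
        renamed = ET.SubElement(exported, mapping.get(child.tag, child.tag))
        renamed.text = child.text
    return exported


if __name__ == "__main__":
    print(ET.tostring(option_1_foreign_element("Survey A"), encoding="unicode"))
    native = option_2_native_element("Survey A")
    print(ET.tostring(export_with_mapping(native), encoding="unicode"))
```

Under option 2 the instance stays purely in the DDI4 namespace and the foreign elements appear only at export time, so a change in the external standard touches the mapping table rather than the model.
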


As the implementations are derived from the model, a possible solution is to make both implementations in the XML space and decide later via reviews.


Review from documentation perspective

Quick look revealed room for improvement.

Documenting views would improve by clear use cases.

Inheritance of documentation from other views - how is this solved?


Homework

Follow up on the TIC Name/Label/Description discussion


 Virtual meeting 24 July

Attendees: Jay, Larry, Achim, Oliver, Thérèse


Discovery

The Discovery view is a bit confusing at the moment. There is a view called Discovery and there are also two packages - Discovery and New Objects for Discovery. It seems like 3 disjointed chunks. What should there be? The view can only reference objects that already exist (because all it gives you when editing is a checklist of all existing objects). So for the moment there should be a view called Discovery and one package called Discovery. The Discovery view may reference objects which are in packages other than the Discovery package (for example conceptual objects).

There is a great deal of cross over in the objects. Larry will start to clean up discovery

Action: Larry will start to work on the discovery view.


Mapping to other standards

The discussion about discovery raised an issue about how we are relating to other standards. Currently there is no mechanism at model level to map to other specifications. How do we eliminate duplication with Dublin Core?

Should we externalise the mapping to other objects or do the mapping within our model? We could map outside of the model. This raises the question of what happens if there is only a representation in XML or RDF. Or we could put them in the model where both XML and RDF representations exist.

It depends on the stability of the standard we are talking about. Ok for DC but problem for less stable standards or standards that might be replaced (foaf?).

We could do mapping via Structured comments in object description? But what about cardinalities? Perhaps we could add a column in Drupal to note things like "mappable to Foaf: person".

We should write a proposal for how to handle this.

 Action: Oliver and Achim to make a proposal about the related standards to circulate to group. Larry would like to be involved in the initial discussions as well


An extended discovery view

The current discovery view does not cover everything. There were some other things that people want. For example role of agents or of use of variables in data analysis model, hypothesis. This should be added to the product backlog.


Action items from last meeting

Action: Arofan will go through the Agent package and clean it up. - Arofan not at the meeting


Action: Remove Service object and fix relationships to Machine and fix verb tense in relationships in Process package

Action: Create "Complex Process" package and move objects that are not part of the simple - Jay - Jay will do this next week

Action: Arofan to lead the writing on the paper on process. - Arofan not at the meeting


Action: Rename Node in Drupal - Jon DONE

Action: Ask Classifications team to help with GSIM mapping work. Classifications meeting is next week, will ask then


Action: Wendy to frame discussion on universe, population etc.

Wendy circulated a document (Object_Universe_PopulationUnit). This gives some descriptions of the objects but there is more to be discussed. For example, the distinction between target universe and the universe actually measured and relationships to represented variable and instance variable work. Does the term “Population” tie DDI too much to surveys of people?


Action: Oliver will have a look at the variable objects.

Conceptual variable and Represented variable have only description and name - do these need other properties? Where is instance variable? In which package should this live? At the moment it is in Simple data description.

The InstanceVariable / field / variable / column issue needs discussion. We need to be clear for end users of DDI4 about the distinctions between represented variables and instance variables.

Jay has example of conceptual variable (structural equation modeling latent variables are conceptual, manifest variables are represented)


Next meetings

Flavio is going to be the modeller for the Classifications group so we will ask him to join these meetings. We need to set a regular time for these meetings. Thursdays are no good as Wendy can't attend. Thérèse will circulate a poll trying to find a better time.

 Virtual meeting 11 July

Attendees: Jay, Arofan, Larry, Oliver, Wendy, Jon, Marcel, Thérèse


Notes:

Agent

Oliver looked at the Agent package in Drupal. He noticed that there was a pattern in what was modelled. Entities with properties that may change over time had been externalised to a property container (example was Individual Name Type). This container can be changed over time. The other approach is to collapse everything into the main object (Example: Individual). It was pointed out that there is a design principle that states that we only model "real" things. A decision was taken to simplify/collapse the objects in the Agent package.
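
The two modelling approaches contrasted above can be sketched roughly as follows. The class and field names are hypothetical and chosen only to mirror the Individual and Individual Name Type example; they are not the Drupal definitions.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class IndividualName:
    """Externalised property container: one entry per name held over time."""
    full_name: str
    effective_from: Optional[str] = None  # ISO date string, hypothetical field


@dataclass
class IndividualWithContainer:
    """Approach 1: time-varying properties live in a separate container."""
    names: List[IndividualName] = field(default_factory=list)

    def current_name(self) -> Optional[str]:
        # Assumes entries are appended in chronological order.
        return self.names[-1].full_name if self.names else None


@dataclass
class IndividualCollapsed:
    """Approach 2: everything collapsed into the main object."""
    full_name: str


if __name__ == "__main__":
    person = IndividualWithContainer(
        [IndividualName("A. Smith", "2010-01-01"), IndividualName("A. Jones", "2014-07-01")]
    )
    print(person.current_name())
    print(IndividualCollapsed("A. Jones").full_name)
```

The collapsed form is simpler to model and document, at the cost of holding only one value at a time unless the property itself is made repeatable.
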

Action: Arofan will go through the Agent package and clean it up.

Process

The Service object is a duplication of the Machine object in the Agent package. The Service object should be removed and the relationship made to the Machine object.

Some of the relationships are expressed in past tense. These should be changed to present tense

It was agreed that the objects in the Process package should only support a simple process. There are a number of objects that relate to the parallel processing use case (example: Split, Split/Join). These should be moved to a separate package called (for the moment) "Complex Process". A paper should be written that shows how to describe processes in other views. Arofan will write it, with input from Jay. Steve McE would also be a good person to be involved. Larry is interested in reviewing the paper.

Action: Remove Service object and fix relationships to Machine

Action: Fix verb tense in relationships

Action: Create "Complex Process" package and move objects that are not part of the simple - Jay

Action: Arofan to lead the writing on the paper on process.


Conceptual

There is a problem having an object called "Node". It breaks the graphs in Drupal. It was agreed to rename the object "Nodec" until a more permanent solution can be found.

There are some issues with the alignment of the DDI 4 modelling and GSIM. There is sometimes a conflict between GSIM and the way it works in DDI 3.2 now. A decision has to be made in each of these instances as to why there is a difference (whether from GSIM or DDI). We need people to go through and carefully check each object.

There is a conceptual variable and a represented variable, but we could not see where the instance variable is. It looks like it belongs to another team (Simple Data Description). Oliver will have a look at what is going on.  

There was a question about the Universe, Population and Units objects. We need to make sure that we get these basic objects right. A discussion should be had with people like Arofan, Wendy, Jay, Dan G and Jenny Linnerud. Jenny had some problems implementing these objects from GSIM so may have some useful info to add to the discussion. Wendy will write something to frame the discussion.

Action: Rename Node in Drupal - Jon

Action: Ask Classifications team to help with GSIM mapping work.

Action: Oliver will have a look at the variable objects.

Action: Wendy to frame discussion on universe, population etc.



 Virtual Meeting 17 June

Attendees: Jay, Larry, Jannik, Johan, Oliver, Wendy, Jon, Thérèse


Notes:

The Agent, Process, Conceptual parts of the library and the discovery view were created in Drupal during the Toronto sprint. They are now ready to be reviewed by the modelling team. There are notes on the wiki which should help you to know what needs to be checked in Drupal and what conventions have been agreed (see: Modelling Team Meeting Minutes). Wendy will review this document to make sure that it is up to date and correct.

The review work was allocated as follows:

  • Agent = Oliver
  • Process = Jannik
  • Conceptual = Jay
  • Discovery = Larry

Jon will look at everything from a documentation rather than modelling perspective. Johan will go on holiday.

The group agreed to meet in 2 - 3 weeks to discuss results of the reviewing work.


 TC modeling related meeting 7 August

DDI TC Meeting Minutes

2014-08-07 

In Attendance: Wendy Thomas (organizer), Johan Fihn, Dan Gillman, Arofan Gregory, Larry Hoyle, Jon Johnson, Flavio Rizzolo, Dan Smith 

Secretary: Elise Dunham

AGENDA

Moving forward: grouping mechanisms

HOMEWORK

  • Look over Wendy's Groups/Collections of Study Units document. (Sent to DDI-SRG on 2014-08-06)

DISCUSSION

Modeling Bags

  • One idea: define order at the item level. Rather than asking the collection if it has an order, ask the items if they have relationships to one another.
  • Reasons for asserting order at the group level:
    • An item may belong to multiple groups.
    • Groups are reusable.
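
A minimal sketch of asserting order at the group level, as argued above: the same item can sit in two groups at different positions because the order belongs to the group, not to the item. The class names are hypothetical and are not the generic bag types under discussion.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class Item:
    """A grouped thing; it carries no ordering information of its own."""
    identifier: str


@dataclass
class Bag:
    """Basic bag: membership only."""
    members: List[Item] = field(default_factory=list)

    def contains(self, item: Item) -> bool:
        return item in self.members


@dataclass
class OrderedBag(Bag):
    """Bag with order: position is defined by the group's member list."""

    def position(self, item: Item) -> int:
        return self.members.index(item)


if __name__ == "__main__":
    q1, q2 = Item("q1"), Item("q2")
    first = OrderedBag([q1, q2])
    second = OrderedBag([q2, q1])
    # The same item has a different position in each group.
    print(first.position(q1), second.position(q1))
```

If order were instead recorded on the items as relationships to one another, an item shared by two groups would need a separate set of relationships per group, which is the reuse problem the bullets above point to.
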

Extensions

  • Need to identify the generic bag types. Already know we need one basic bag and one for order.
  • We can consider a certain set of privileged bag types (controlled vocab) and then allow people to make their own.
  • Another requirement is a non-maintainable group, ex: for things that are transitory during production.
  • Type-extension of bags will not require that all possible collections be specializations of the defined generic bags. For example, questionnaire objects will maintain their own semantics and function outside of these specializations.

Schemes in DDI 3

  • Need to walk through DDI 3 and identify schemes and groups that are actually modules/views (artificial constructs). We won't be maintaining this kind of thing in 4.
  • Need to identify what we modeled in 3 as groups within schemes and test using 11404 concepts.
    • Model the minimum bag, and pile on top of it to explain the kinds of grouped things, then extend to other grouped things already in the model.
    • Take all the grouping requirements from 3 and move on to include all the additional things we want to group in 4.

Communicating with Modelers re: Seeming Duplication Across Teams

  • The top-down approach makes duplication unavoidable during initial start-up phases.
  • Behooves us to get through the start-up phase quickly by communicating effectively with content teams and documenting decisions.
  • This is an important discussion, and it belongs with the modelers group. Not for TC agenda.


NEXT WEEK

  • Walk through the types of collections we need
  • Discuss how many/what forms of basic bags we need
  • Discuss process for developing modeling guidelines on grouping mechanism.
 TC Meeting 31 July regarding modeling

DDI TC Meeting Minutes
2014-07-31

In Attendance: Wendy Thomas (organizer), Dan Gillman, Jay Greenfield, Larry Hoyle, Flavio Rizzolo, Achim Wackerow

Secretary: Elise Dunham


AGENDA

Moving forward: grouping mechanisms


ACTIONS

Wendy
-Revisit documentation & work done on pulling together use cases around groups of study units and different ways of grouping study units. Start a thought piece and send around so others can review and add on.
-Assist Achim with DDI -> 11404 & Smalltalk mapping if needed.

Achim
-Look at existing groups types in DDI 3 and experiment with mapping to ISO 11404 and Smalltalk.

All
-Send out any thoughts/musings on grouping mechanisms over the next week.


DISCUSSION

Bindings

-Jay sent a document highlighting the importance of de-coupling data models from bindings/encodings. There’s agreement that the grouping mechanisms discussion needs to happen from the modeling perspective rather than the binding; doing bindings is something that should happen independently of the modeling work.
-It’s important for us to be aware of the struggle associated with transitioning from XML to UML—when modeling from an XML perspective binding and modeling happened together, but now, in UML separating them is one of our design criteria.

Abstract Bag

-Need to agree on the features of the bag and need to come up with extensions on that for specific types of collection purposes.
-In order to frame this discussion and bring it from the abstract to something practical, Achim volunteered to map group types on DDI 3 to 11404 and SmallTalk.
-"Why are we grouping things?" needs to be one of the driving questions framing this discussion and informing decisions.

Level of Extensibility
 
-How do we limit application of these grouping mechanisms to prevent creation of groups that shouldn’t be created? Ex: ResponseDomain and CodeList don’t need groups around them. Is it possible to identify a limited set of groups, or is it not feasible to identify all of the use cases? Should we leave the opportunity to use a group for a purpose that is currently unknown?
-Achim’s mapping will help frame the discussion on striking this balance.

Challenges of UML -> Relational Mapping

-The conceptual constructs we’ve been using so far can’t be expressed in a relational model. The aggregations and additional semantics don’t map cleanly; the more object-oriented the model becomes, the more problematic relational expression will be. It’s something we need to consider.
-Would IDEF-1x be better?
-This is something we should keep on the radar and perhaps bring to a different group for consideration.


NEXT WEEK

Continue grouping mechanisms discussion



 TC Meeting of 20140612 regarding Modeling Issues

DDI TC Meeting
2014-06-12 
In Attendance: Wendy Thomas (organizer), Johan Fihn, Dan Gillman, Jay Greenfield, Jeremy Iverson, Jon Johnson, Dan Smith 
Secretary: Elise Dunham 

TC used this meeting time to advance work commitments to the DDI 4 Moving Forward Project 
AGENDA: 
Name, Label, Description
Identifiers
Containment and Reference/What do we reference?* 
*TC decided to begin referring to the Containment and Reference portion of their work as "What do we reference?" at the 2014-06-12 meeting. 

Actions
Dan Gillman: Write up a document about his ideas on approaching the name, label, description issue in a rigorous manner 
All: Review and send in feedback on GSIM 1.1 to DDI 3.2 mapping document 

Discussion
Name, Label, Description
Frame of discussion/background: in previous versions of DDI, we've used ISO/IEC 11179-5 as the basis for our approach. This included a set of 3 objects: Name, Label, and Description. For Name we used a NameType, with each object having a specific name of type="NameType" (i.e. VariableName, ConceptName, etc.) We've made the decision to just use Name as a direct property of the object. Questions are:

  1. What objects should we be using this with?
  2. Is the full triple always required?


Dan Gillman would like to see us put more structure around names, concepts, labels, and designations. He will write up a document laying out his ideas on this issue, to give a sense of what the distinctions are and how to structure them. He is aiming for early next week; send any questions to Dan in the meantime. The group consents to this plan and will use this document when making its proposal to the modeler group. 
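
As a rough illustration of the change described in the background above, the sketch below contrasts an object-specific name of type NameType with Name as a direct property. The class and field names (VariableOld, Variable, migrate) are hypothetical, and treating Label and Description as optional is only an assumption, since whether the full triple is always required was left as an open question.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NameType:
    """Stand-in for the earlier typed name structure (hypothetical shape)."""
    content: str


@dataclass
class VariableOld:
    # Earlier pattern: an object-specific name of type NameType.
    variable_name: NameType
    label: Optional[str] = None
    description: Optional[str] = None


@dataclass
class Variable:
    # Newer pattern: Name is a direct property of the object.
    name: str
    # Assumption: Label and Description are optional (open question).
    label: Optional[str] = None
    description: Optional[str] = None


def migrate(old: VariableOld) -> Variable:
    """Copy the triple across, unwrapping the typed name."""
    return Variable(
        name=old.variable_name.content,
        label=old.label,
        description=old.description,
    )


print(migrate(VariableOld(NameType("AGE"), label="Age of respondent")))
```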
Identifiers
Dan Smith wrote up a document on this issue and discussed each section of it with the group. Highlights from this discussion:

  • Group agrees on requirements Dan outlined in this document
  • Character restrictions: in 4 we will lift character restrictions to allow for support of localized identification schemas. The historical reason for character restrictions is now managed by the ability to capture a DDI lifecycle URN in 2.5: identifiers in 4 will still be able to be represented in Codebook.
  • When we get back to working on administrative metadata we need to pay attention to potential overloading.
  • XML serialization and RDF serialization: Dan has outlined 2 potential strategies for doing this, and says it could be done with either approach or a combination of them. Responsibility for determining the approach lies with the groups doing the transformations.
  • Dan's document is ready to be presented. It is meant to describe the how, not the what, which is for the modelers; the how and the application are intentionally being treated as two separate functions in this workflow.


Containment and Reference/What do we reference?

  • Will discuss fully in another meeting.
  • Renamed this discussion topic to "What do we reference?"


Other TC Business
Dan Smith put out a call on the DDI-user list that starts the mapping of GSIM 1.1 to DDI 3.2. He asks that everyone look it over and send in any comments. If there's feedback, Wendy will put it on the agenda for next week's TC meeting. 
Plans for Next Week
Back to TC 3.2 work: Sampling, weighting and questionnaire development.