Notes of 30 November meeting

Attendees: Dan Gillman, Larry Hoyle, Steve McEachern

Discussion began with the distinctions between precision and number of digits, and similarly between intended and physical data types. (Note that there are differences introduced by the choice of platform.) Larry provided an update on the analysis of the DD model that he had used in rendering the Australian Election Study codebook in DDI4. There was one outstanding item to be resolved: the number of variables.

Discussion:
To be added as an issue for the Modelling group - suggest adding into SimpleCollection.

As part of the discussion, the group identified a need to file a further issue to allow representation of groups of VariableGroups (and more generally of "Groups of Groups"). Both issues were filed for the Modelling group.

At this point, the Data Description group is satisfied that the model is sufficient to support the requirements of the DDI prototype, and is ready for handover to the Modelling group.

Proposed next meeting:
Data Description Meeting - 5 October 2017

Attendees: Larry Hoyle, Dan Gillman, Dan Smith, Steve McEachern, Jay Greenfield

The agenda for this meeting was to outline the basic work program for Dagstuhl.

Questions on Larry's work

To facilitate this discussion, Larry walked through his slides (JIRA issue 20):

1. Is Datum "the thing we have written down", or "the thing we are observing"? This is an application of the Signification pattern.
2. Is Datum in the LogicalRecord or in the PhysicalRecord? Datum - the Sign - is in the LogicalRecord.
3. Is DataPoint in the LogicalRecord or in the PhysicalRecord?
Touchpoints for DataDescription and DataCapture

Proposals coming from the DataCapture group:

1. When creating a ResponseDomain for use within either RepresentedMeasure or RepresentedQuestion, they would like to be able to reference a RepresentedVariable in the cascade. In DDI3 you could have multiple domains joined together; the proposal for DDI4 is a 1-to-1 relationship between a ResponseDomain and the ValueDomain associated with a RepresentedVariable. Note also that Capture is REUSABLE - and therefore Capture is REPEATED and PROSPECTIVE.
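The 1-to-1 relationship proposed in point 1 above might be sketched as follows. This is a minimal illustration only; the class and attribute names mirror the DDI4 vocabulary but the Python shapes are assumptions, not the model itself.

```python
from dataclasses import dataclass

@dataclass
class ValueDomain:
    """The set of allowed values, e.g. a code list or a numeric range."""
    description: str

@dataclass
class RepresentedVariable:
    """A variable in the cascade that carries a concrete representation."""
    name: str
    value_domain: ValueDomain

@dataclass
class ResponseDomain:
    """Proposed DDI4 shape: exactly one ValueDomain, reached via a
    reference to a RepresentedVariable in the cascade (1-to-1)."""
    represented_variable: RepresentedVariable

    @property
    def value_domain(self) -> ValueDomain:
        return self.represented_variable.value_domain

# A RepresentedQuestion's ResponseDomain then reuses the variable's domain:
age = RepresentedVariable("age", ValueDomain("integer 0-120"))
rd = ResponseDomain(age)
assert rd.value_domain is age.value_domain
```

The design point is that the ResponseDomain holds no domain of its own: it always resolves to the ValueDomain of the referenced RepresentedVariable.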
This is the RETROSPECTIVE case. DC have not gone to this in the Capture model, but there is the capacity to record the SourceCapture in an InstanceVariable. Data collection would have to be done as a PROCESS. However, we do want to ensure that the InstanceVariable is able to point to the Capture that created it. Should this be an InstanceCapture? Dan Smith suggests probably yes.

Dan G. suggests one means for traversing the questionnaire is by working up the cascade to the concept. Dan S. suggests that there is actually a graph - one path through the Cascade, and the other through the Instrument to the Concept (finding all the Data that have been collected from this instrument).

There is still the open discussion of where the Sentinel values should fit in the cascade - Dan S. suggests putting them at the Definition level rather than the Usage level. Larry also noted that we still need to keep in mind Units (particularly changing Units, e.g. in data harmonisation).

3. Common data elements (This is coming from the ICPSR project. Jay also identified the NIH example of this: https://cde.nlm.nih.gov/home) These are definitions that combine Questions with Representations. Dan S. suggests that we explicitly model this in DDI4. The CDE is the Item (RepresentedQuestion or RepresentedMeasure), plus its ResponseDomain, plus the RepresentedVariable it creates. We may also want the ConceptualVariable. This could be a View, which brings together the relevant content from DC/DD, etc.

Work program for Dagstuhl:

1. Addressing the above touchpoints 1 & 2
2. Reviewing and resolving the maturity issues identified by Jay in LogicalDataDescription
3. Exploring the CommonDataElements use case (item 3 above). Feeds into the UseCase program in Week02.

Work process for Dagstuhl: Jay notes that we don't have everyone in the room. We will need to coordinate possible dial-in times at Dagstuhl. Noted that end of day in Germany is start of day in the US (4pm in Dagstuhl is 9am in Minneapolis).
Next meeting: To be confirmed - will be early November after Dagstuhl workshops
Data Description meeting, 14 January 2016, 2100 CET

Attendees: Barry Radler, Flavio Rizzolo, Dan Smith, Jay Greenfield, Ornulf Risnes, Steve McEachern, Dan Gillman (from 21.40 onwards)
Apologies: Larry Hoyle

There were three outstanding questions from the previous meeting designated for discussion - see previous meeting notes below.

1. Relationships between DataPoint and DataStructure

It was agreed to remove the relationships between DataPoint and DataStructure
and then to add the same two relationships from DataStructure to InstanceVariable.

Questions on this point:
Dan's argument: DataRecord and DataStructure store data, but Viewpoint stores relationships.
Flavio: DataStructure has homogeneous DataRecords only (confirmed by Ornulf).
THUS - the definition of DataStructure needs to state that it is a homogeneous set of DataRecords.

Agreed that the following needs to be added to the model documentation:
Further questions:

Dan: How do we associate specific Viewpoints with the DataStructure?
Jay: Can a Viewpoint describe, for example, an RDF triple? Dan suggests that this might be possible with the use of Roles (e.g. Predicate is defined as an Identifier role for an IV).
Ornulf noted that some of the uses here are documented in the paper he and Dan authored at the Dagstuhl sprint: https://docs.google.com/document/d/1-vxWdastNsTWMf8qlR35wj1128FNSX-4YBrA_MJBaLk/edit

Different Viewpoints could be layered on top of the DataRecord. You also don't necessarily need to use the Viewpoint. Dan S. noted that there are three layers that can be used:
You will always need to use the DataStructure, but the other two will be optional.

DataStructure will therefore have the following relationships:
2. ORDERING

Agreed that ordering of DataRecords in a DataStructure should be possible but OPTIONAL. Ordering of InstanceVariables in a DataStructure still needs to be clarified.

3. Use cases

This point wasn't covered directly in the discussion. Agreed that there is a need for testing use cases against the model now, but we first need to finalise the clean-up of Lion (per Wendy Thomas's review - see minutes below). Agreed therefore that Flavio would update Lion/Drupal, and we would have a special meeting Monday Jan 25 to review this, ahead of the regular meeting on Jan 28. Steve, Jay and Flavio will convene the review meeting, with others welcome if available.

Actions:
Next meeting(s):
a) Review meeting Monday Jan 25th, time TBC.
b) Regular meeting Thursday Jan 28th, 10PM CET, GoToMeeting: https://global.gotomeeting.com/join/148887013

(Note that the meeting time will return to 10pm CET for the next regular meeting.)
Meeting minutes 17/12/2015

Attendees: Dan Gillman, Jay Greenfield, Larry Hoyle, Steve McEachern, Barry Radler, Ornulf Risnes, Chris Seymour, Dan Smith

Dan Gillman opened with a review of the PPT he provided earlier this week on "Tracking Datums". Key points in Dan's proposal:
Jay: What about the collection of copies of the Datum? What is this thing (if not Datum)?
Larry: How do we identify the particular Datum that is put into the DataPointInstance?
Jay asked whether Dan wants a class to indicate that all of the Datums represent the same conceptual thing. Dan agreed.
Ornulf: If we have access to the Variable Cascade, can we infer the relevant concepts associated with the Datum?
Ornulf: What does this add that we don't already have?
Jay's interpretation was that the RHS of Dan's model could improve the model, while the LHS is more complicated. He suggests that there are two roads:
Dan: The aim of his model is to associate a copy of a Datum and an InstanceVariable into a DataPointInstance.
Ornulf: Not comfortable with where we are at. He argues that we CAN re-use DataPoints, and that we can track DataPoints (he is currently doing this in RAIRD). Dan asks whether Ornulf can reuse STRUCTURES.
Jay suggests that what Ornulf is doing is actually using DataPointInstance (but naming it DataPoint, as is currently in the model). The question here is fundamentally about reusability.
Larry: Is what is "in" the DataPointInstance a Signifier? And is DataPoint the LOGICAL and DataPointInstance the PHYSICAL?
Dan: The key argument is that we have the concept we want to represent (e.g. the NUMERAL five) and a series of strings that signify the concept (e.g. different strings: "5", "IV", ...).
Dan: What isn't currently covered is the fact that DataPoints can be RE-USED. Ornulf argued that he thinks that's covered, but Dan's position is that we don't yet have the "empty bin".
Dan S./Larry: Are we talking about the difference between a logical and a physical, between empty and populated, ...?

(Dan G. left the meeting at this point.)

Dan S. suggests that everything Dan G. is covering is represented in the current version of the model in Lion - in particular, we can address a DataPoint from the InstanceVariable and DataRecord. HOWEVER, Dan S. did have a concern that Ordering in the DataStructure is ordering DataPoints. Dan S. suggests that ordering should be of InstanceVariables, and argued that the DataStructure relationship should be to InstanceVariables rather than DataPoints. Larry asks whether the relationship should be between the DataRecord and InstanceVariables. Dan notes that if the Record complies with the Structure, then that isn't necessary.

Questions for discussion at the next meeting:
Next meeting: January 14, 2016. GoToMeeting: https://global.gotomeeting.com/join/148887013

Proposed time is ONE HOUR EARLIER - 2100 CET. Steve to poll group members about this.

NOTE ALSO: NO MEETING DECEMBER 31
HOW FAR DO WE WANT TO GO WITH WHAT WE DESCRIBE?

Jay has put together a deck and has a proposal. He is modifying the GSIM model of "data set". The first interesting point is that the way GSIM represents attributes doesn't give them the possibility of having a structure. We'd want to modify it so it could have a structure. This would be a hook to enter what Larry and Arofan are doing.

Discussion took place about what defines 1NF/3NF in the GSIM model and Jay's proposal. But does it matter, or can the terms be changed for description? The description that Jay proposed makes sense, but terms should be changed to avoid NFs. Attributes need to be worked into the GSIM model as they are variables. There are variables in the attribute sets.

LARRY - In DDI do we want to model a datum as a collection of variables or a single variable?
DAN - It's a single variable.
LARRY - But then Ornulf describes a datum as a collection of variables.

So what are the terms to be used if we're calling a datum a single variable?

Datum
Data Structure
Coming back to Jay's material this morning: 2 different types, the logical record and the basic idea of the key-value pair (reordering above).
Would the key-value pair possibly be triples? Graph data? Where are we in relation to the work done yesterday? We have a basic structure with which to describe a CSV file.

DAN - What could be called a key-value triple contains a variable (attribute), unit (ID), value (measure). (There are parallels between this and the datum structure.) So this is the fundamental thing. Let's use that to define a record, and from that define a CSV. A record is an ordered set of these key-value triples ("kvipples") that share the same unit.

Larry making a proposal: We've got this record which has 3 collections associated with it: ID, Measures, Attributes. Record, ID, Measures, and Attributes are all collections. Then we want to define a structure of records. That can be instantiated as a dataset:

RecordSet: a set of Records (a sub-class of Collection)
DataStore: a store of a RecordSet

STEVE - Can we describe a CSV at this point?

Moving from RecordSet to DataStore we move from logical to physical. We have separated the logical and physical forms. A CSV is one type of DataStore, and all the logical parts are in the RecordSet. Fixed Format is another type of DataStore. What does a Key-Value Triple option look like?

How can this work with aggregated data? GSIM didn't try to tackle them all under one structure; are we trying to do it with one? We can use the basic model of building this up, but we have to interpret it differently and have different relationships associated with it in the case of aggregates. We need to solve the problem of dimensional data. Take the combination of the values of each of the dimensions; every combination defines a different cell. Applied to the unit type in the microdata, it itself defines an aggregate unit.

Record: Cell
Unit Type (e.g. "people")
Dimensions (e.g. "age", "sex")
Measure (e.g. "income")
Key: 40 y.o. male plumbers (1..n components)

The components could be represented by variables. Each kvipple is a cell. And every cell is a record.
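The "kvipple" model sketched above can be made concrete in a few lines. This is a minimal illustration under the group's working terms - the Python shapes and helper names (make_record, to_csv) are assumptions, not part of any DDI specification.

```python
import csv
import io
from dataclasses import dataclass

@dataclass
class Kvipple:
    """Key-value triple: variable (attribute), unit (ID), value (measure)."""
    variable: str
    unit: str
    value: object

def make_record(unit: str, **values) -> list[Kvipple]:
    """A Record is an ordered set of kvipples sharing the same unit."""
    return [Kvipple(var, unit, val) for var, val in values.items()]

# A RecordSet is a set of Records (the logical form).
record_set = [
    make_record("person-1", age=40, sex="M", income=27000),
    make_record("person-2", age=35, sex="F", income=31000),
]

def to_csv(record_set: list[list[Kvipple]]) -> str:
    """Serialise the logical RecordSet into one physical DataStore: a CSV."""
    out = io.StringIO()
    variables = [k.variable for k in record_set[0]]
    writer = csv.writer(out)
    writer.writerow(["unit"] + variables)
    for record in record_set:
        writer.writerow([record[0].unit] + [k.value for k in record])
    return out.getvalue()

print(to_csv(record_set).splitlines()[0])  # unit,age,sex,income
```

This keeps the logical/physical split from the discussion: all the meaning lives in the RecordSet of kvipples, while the CSV (or a fixed-format file) is just one serialisation of it.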
The unit incorporates the key. Are we losing the dimensions? Does the model work? The only thing that's really changing is the idea that the unit is going from one kind of object to an abstract collection object. It's the set as a completed set - not the individual elements within it - that is the unit. The dimension isn't lost; it's a combination of aggregated variables.

Unit + dimensions + variable + value = Key

The unit is shared by the entire cube. It describes the characteristics of the entire population (working with census data). For the microdata the dimensions are constant (e.g. person). For the macrodata the unit is constant. Key is M, 40. Variable is income. Value is 27,000.

Is the unit the cube or the combination of things in the key? What is the unit? In a microdata case each cell is a record. The unit is identified by the key; it's the interpretation of each cell.

Dimensional data takeaways:
Units, whether groups or individuals, mean different things. The unit is dependent on the key. What's the unit of analysis - the unit of the cube or the unit of the cell? What do we want to do with it? The answer to the unit question lies in where we attach more information. We want to put in rules for putting together different slices to build the RecordSet in the unit. We need to say what the "thing" is before we put everything together. We need to look at how a datum is described from the point of view of the variables.

The following email and links were provided by Ornulf following the call:

Regarding the question of relations; we've lately come across some interesting thinking in what seems to be an alternative (and more forgiving) way of Data Warehousing; Data Vault Modeling:
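Returning to the dimensional-data discussion above - every combination of dimension values defines a cell, and each cell is a record whose key combines the dimensions - a toy cube can make this concrete. This is a sketch of the idea only; the dictionary-of-cells representation is an assumption, not a DDI construct.

```python
from itertools import product

# Unit type shared by the whole cube (e.g. "people"); the measure is income.
dimensions = {"sex": ["M", "F"], "age_group": ["30-39", "40-49"]}

# Every combination of dimension values defines a different cell.
cube = {key: None for key in product(*dimensions.values())}
# None marks the "empty bin": a cell awaiting its datum.

# Populating one cell: the key both identifies and interprets the value.
cube[("M", "40-49")] = 27000  # income for males aged 40-49

assert len(cube) == 4  # 2 sexes x 2 age groups = 4 cells
```

Note how this matches the takeaway that the unit is shared by the entire cube, while each cell's meaning comes entirely from its key (here, the tuple of dimension values).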
In seeking to start creating a simple logical structure, we began by looking at the 4 objects that had been created during Dagstuhl: DataPoint, DataStructure, DataStore, and DataStoreSummary. Dan Gillman also began brainstorming a model of DataStructure along with the group.

Review of the DataStructure led to discussion of whether any parts of it needed to be reviewed and redesigned. A DataStructure is an ordered set of DataPoints (a record). And a RecordSet is a collection of DataStructures (a table).

The discussion raised the issue of types of records and sequence of records. Question - do we want to describe a very simple CSV (all DataPoints in a column are the same variable), or a more complex type, e.g. a Household/Person structure with record type variables and sequence variables? If all records do not contain the same sequence of variables then we need to describe record types and sequences.
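The Household/Person question above can be illustrated with a small sketch in which a record-type variable tells a reader which sequence of variables each record carries. The variable names and layout are purely illustrative assumptions, not any actual study's file description.

```python
# Hierarchical file: record type "H" (household) and "P" (person) carry
# different variable sequences, so the description must cover both types.
record_types = {
    "H": ["rectype", "hh_id", "n_persons"],
    "P": ["rectype", "hh_id", "person_seq", "age"],
}

raw = [
    ["H", "001", "2"],
    ["P", "001", "1", "42"],
    ["P", "001", "2", "40"],
]

def parse(rows: list[list[str]]) -> list[dict]:
    """Pair each raw record with the variable sequence for its record type.

    The first field is the record-type variable; person_seq is the
    sequence variable ordering persons within a household."""
    return [dict(zip(record_types[row[0]], row)) for row in rows]

parsed = parse(raw)
assert parsed[1]["age"] == "42"
```

In the simple-CSV case, record_types would collapse to a single entry and the record-type and sequence variables would be unnecessary - which is exactly the distinction the group is weighing.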
DataDescription Meeting Minutes: Thursday March 26th, 2015

Attendees: Jay Greenfield, Dan Gillman, Larry Hoyle, Barry Radler, Ornulf Risnes, Steve McEachern

Jay walked through the current state of the Process model and what had fed into the work so far. He pointed out that the model (and 3.1 generally) were based on our "traditional" model of questionnaires and datasets, but that new datatypes are now becoming commonplace and possibly dominant. Our recent work has largely been exploring these types. Known cases we are now asked to support include:
Jay pointed out that we need to take on board a new notion of lifecycle - or in other words, per Ornulf, there is more than one way to generate a datum. Dan and Jay both pointed out that in this "new world", we have no clear paths to a datum. This is something that needs to be further fleshed out.

Dan's comment: The logic for questionnaire data is clear: question - observation - capture - datum. Other cases are less so, e.g. derivation, which generates data but requires no question. Here the input is an existing datum. Ornulf noted that a derivation has various characteristics: it has an input datum, a formula for the derivation, and a datum as an output.

Larry gave an example from a clinical psychologist in which a process is used to collect a combination of questions and observations, but the ultimate "thing" being recorded is actually the scale score as the datum. Barry noted that there are similar sections in MIDUS where the parts are not relevant, but it is the whole that matters. Barry points out that the step between capture and datum (subsumed now within Observation and ProcessStep) is "hiding" a number of significant steps - but that we can probably draw on the strength of the process model to document this.

Jay considered the similar case of Computer Adaptive Testing, which works from a battery of test questions to ask a set of increasingly difficult or easy questions, adapting based on previous responses. Dan points out that there are some similar cases in the survey community; Barry gave a similar case of conjoint analysis in marketing, as did Jay in EHR.

It may therefore be appropriate to start digging into the process model to see if we can accommodate some of the above use cases using the current combination of Capture, DataDescription and Process. Jay suggested that we should be exploring these in detail - and that it cannot be rushed.
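Ornulf's characterisation of a derivation - input datums, a formula, and a datum as output - could be sketched minimally as below. The class and its fields are illustrative assumptions for this discussion, not DDI model objects.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class Derivation:
    """A derivation per the discussion: existing datums in, a formula,
    and a new datum out - no question involved."""
    inputs: list[float]
    formula: Callable[[list[float]], float]

    def run(self) -> float:
        """Produce the output datum by applying the formula to the inputs."""
        return self.formula(self.inputs)

# e.g. Larry's clinical example: the recorded datum is the scale score,
# not the component items it was derived from.
scale = Derivation(inputs=[3, 4, 5], formula=lambda xs: sum(xs) / len(xs))
assert scale.run() == 4.0
```

The point of the sketch is the shape, not the arithmetic: unlike the question-observation-capture-datum path, the inputs here are themselves datums, which is why the group sees derivation as a distinct route to a datum.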
It would be useful therefore to now develop these use cases to test out the current version of the model, to (a) assess the current objects and process model, and (b) determine what else needs to be included. Suggested worked use cases:
Jay noted his work with Splunk here, where they are always aggregating and disaggregating from the datum level. Dan noted worries here about confidentiality in such a process. Jay also recognised this, but pointed out the access rights associated with each datum as one means to resolve it. Ornulf has also been addressing this in the RAIRD work, using statistical disclosure control on the end products.

Moving forward, it was agreed to take away these use cases and start describing them using the Capture/DataDescription/Process views. Example cases are given above, but it would be good to get additional cases of interest to the members of the group - particularly where group members are collaborating on cases.

This work will require some extensive thinking, so the agreement was made to continue to work on these use cases, but to switch focus for our fortnightly meeting to the Physical Data Description.

Next meeting: Thursday 9 April. Time to be confirmed (due to daylight savings changes in Europe and Aust/NZ). The agenda will be to review and evaluate the current status of Physical Data Description. This will need to focus on:
In preparation, it would be useful if team members could review the three pieces of work so far in this area:
Data Description Meeting 11/3/2015

Attendees: Steve McEachern (ADA, Australian National University), Larry Hoyle (IPSR, University of Kansas), Dan Gillman (BLS), Barry Radler (MIDUS, University of Wisconsin), Simon Lloyd (ABS), Ornulf Risnes (NSD)

We reviewed progress since the last meeting, particularly the document Steve and Barry generated out of the "Linking..." presentation developed by Dan and Jay. This integrated model, bringing together the interface between Capture and DataDescription, is available here as a PDF, with the objects and relationships specified in the document available in the http://lion.ddialliance.org Drupal site.
The general conclusion from the discussion is that the relationship between ProcessStep, Observation and Datum looks sound, but that the ProcessStep and Observation objects may need additional work in order to see if they are sub-classes of a broader type. The next meeting will therefore explore further the requirements both Capture and DataDescription have for the Process model. In the interim, additional email discussion will continue around comments on the Capture-DataDescription link, building on Jay's discussion of similar issues in HL7 and OpenEHR.

The provisional time for the next meeting is Thursday March 26 at 8.00PM Central European time. The GoToMeeting URL is: https://global.gotomeeting.com/join/148887013

However, given Jay's existing work and his role with the Process model, which are the next step in our discussion, we will coordinate times around Jay's availability if required.
2014-03-17 Meeting Minutes

Time: 15:00 CET
Meeting URL: https://www3.gotomeeting.com/join/685990342
Agenda:

1) Status update. Where are we now with SimpleDataDescription? (ØR)
2) Clarify relationship between domain experts and modeler. Define role responsibilities, desired workflow in group (ØR, AW?)
Domain experts add object descriptions and relationships; the modeler puts them into the overall model; then iteration.

What is the status of the round trip (Drupal to XMI to EA)? It works. Is there machine-actionable feedback into Drupal? No. It is possible, but some work is required, and it is not yet clear if there are resources for this task. Furthermore, there are different positions on whether the roundtrip makes sense.
3) Identified issues with the current version (ØR/all)

a) The model is sparse on properties for InstanceVariable, RepresentedVariable, ConceptualVariable. Out of scope for this group? Comments: These objects currently only exist in the SimpleDataDescription package. Discussion about GSIM/DDI 3.2 and who's responsible for the "core variable objects".

b) Do we need DataSerialisation (the physical counterpart of DataDescription)? DataDescription already relates to InstanceVariable, which relates to Field (column) in the RectangularDataFile. Because of this, a path exists from the Fields in the RectangularDataFile via InstanceVariable up to DataDescription and "TOFKAS".

c) DataSerialisation has no relationship to RectangularDataFile. If we decide to keep DataSerialisation, the relationship to RectangularDataFile must surely be added.
4) TODO: Identify outstanding tasks (ØR/all)
See above.
6) Plan milestones (based upon TODO-list, goals and availability) (ØR/all) Overall milestone plan/timelines to be clarified during NADDI sprint. Thérèse Lalor (ABS) is currently the project manager for DDI4 - but only until July 2014.
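The "core variable objects" discussed under item 3a form the familiar cascade, and item 3b's path from a Field up to DataDescription runs through it. A minimal sketch of how they relate follows; the attribute names are assumptions for illustration, not the Drupal/Lion model.

```python
from dataclasses import dataclass

@dataclass
class ConceptualVariable:
    """Concept plus unit type, independent of any representation."""
    concept: str
    unit_type: str

@dataclass
class RepresentedVariable:
    """Adds a concrete value representation to the conceptual variable."""
    conceptual: ConceptualVariable
    representation: str

@dataclass
class InstanceVariable:
    """A RepresentedVariable as used in one concrete dataset; relates to
    a Field (column) in a RectangularDataFile, per item 3b."""
    represented: RepresentedVariable
    field_name: str

# Walking up the cascade from a Field to the concept:
iv = InstanceVariable(
    RepresentedVariable(ConceptualVariable("income", "person"), "integer USD"),
    field_name="INC2014",
)
assert iv.represented.conceptual.concept == "income"
```

This also shows why a separate DataSerialisation may be redundant (item 3b): the InstanceVariable already links the physical Field to the logical description.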
Other notes: