Data Capture meeting minutes
- Former user (Deleted)
- Kelly A Chatain (Unlicensed)
- Dan Smith
Sprint Meeting to finalize DataCapture, DataDescription and Methodology (Dagstuhl, Germany) - 10/18/15 thru 10/23/15.
DataCapture attendees: Barry, Flavio, Dan, Jon
Process Model Issues
The Process Model does not allow the use of sequences to operate in the same way as it does in DDI-L.
This requires a change to the process model so that a sequence contains a Process Step, and that the other structures in ControlConstruct can similarly be contained.
The lack of an "if-then-else" ControlConstruct would make many situations overly complicated, so it needs to be restored to ControlConstruct.
A process step can be either a Control Construct or an Act. A Control Construct designates flow, and an Act is a particular activity such as the capture of data.
This was the outcome of the changes required to the Process Model
In addition, Dan Smith suggested that Split be added to the Process Model (the reason was not captured).
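As a rough illustration of the structure described above, the sketch below models a process step as either a ControlConstruct or an Act, with Sequence and a restored if-then-else as ControlConstructs that contain other steps. The Python class names mirror the terms in these minutes, but the field names (`steps`, `condition`, `then_step`, `else_step`) and the helper `count_acts` are assumptions made for illustration, not part of the agreed model.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ProcessStep:
    """A step in a process: either a ControlConstruct or an Act."""
    label: str


@dataclass
class Act(ProcessStep):
    """A particular activity, such as the capture of data."""


@dataclass
class ControlConstruct(ProcessStep):
    """Designates flow between steps."""


@dataclass
class Sequence(ControlConstruct):
    """Contains ProcessSteps, so it can nest Acts and other ControlConstructs."""
    steps: List[ProcessStep] = field(default_factory=list)


@dataclass
class IfThenElse(ControlConstruct):
    """The restored if-then-else; 'condition' is a free-text placeholder."""
    condition: str = ""
    then_step: Optional[ProcessStep] = None
    else_step: Optional[ProcessStep] = None


def count_acts(step: ProcessStep) -> int:
    """Count the Acts reachable from a step, across all branches."""
    if isinstance(step, Act):
        return 1
    if isinstance(step, Sequence):
        return sum(count_acts(s) for s in step.steps)
    if isinstance(step, IfThenElse):
        branches = [step.then_step, step.else_step]
        return sum(count_acts(b) for b in branches if b is not None)
    return 0


# Hypothetical flow: an intro Act followed by a routed block.
flow = Sequence("main", [
    Act("intro statement"),
    IfThenElse(
        "routing",
        condition="respondent is an adult",
        then_step=Sequence("adult block", [Act("question 1"), Act("question 2")]),
        else_step=Act("question 3"),
    ),
])
print(count_acts(flow))
```

Because Sequence holds ProcessSteps rather than Acts only, the same nesting works for any other ControlConstruct that is added later, such as the proposed Split.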
Changes to the Data Capture model
The model was revisited in light of the changes made to Drupal not reflecting the intentions at the last Sprint.
The main innovation was the evolution of thinking to retain Question as a "first class citizen" and create a similar structure for other forms of data capture. As a consequence, hasResponseDomain and hasConcept could be moved to Capture rather than being repeated in Question, and hasInstruction and hasExternalAid could also be moved from Question to Capture so that they are reusable across all Capture types.
The consequence of this is additional structures (a Question and a separate element called Measurement), but it would simplify / clarify documentation of Capture where both Question and another Capture were used sequentially, such as in the administration of a test within a survey.
Renaming Question to RepresentedQuestion mimics (is analogous to) the design pattern used in the Variable Cascade. As such, both Question and Measurement have Represented and Instance elements.
More work needs to be done on RepresentedMeasurement so that it can support non-Question use cases, such as protocol (blood pressure measurement), data linkage, etc. One promising alternative, one that might satisfy adherents of QuestionGrid/Block, is to extend Measurement with controlled vocabularies that describe the myriad other non-Question and non-survey data capture types.
Reminder note: Not every InstrumentComponent has a Capture (e.g., it could just have a Statement or an Instruction). This is important for keeping the flow flexible during the conduct of a Capture.
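The move of shared properties up to Capture can be sketched as follows. This is a minimal illustration, assuming simple string values for response domain and concept; the field `measurement_type` stands in for the controlled-vocabulary idea and is hypothetical, as are the example values.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Capture:
    """Holds the properties shared by all Capture types."""
    name: str
    response_domain: Optional[str] = None                    # hasResponseDomain
    concept: Optional[str] = None                            # hasConcept
    instructions: List[str] = field(default_factory=list)    # hasInstruction
    external_aids: List[str] = field(default_factory=list)   # hasExternalAid


@dataclass
class RepresentedQuestion(Capture):
    question_text: str = ""


@dataclass
class RepresentedMeasurement(Capture):
    # Hypothetical: a term from a controlled vocabulary of capture types.
    measurement_type: str = ""


# A test administered within a survey: a Question followed by a Measurement.
question = RepresentedQuestion(
    name="seniority",
    response_domain="Code",
    concept="Professional seniority",
    question_text="Please rate your professional seniority",
)
measurement = RepresentedMeasurement(
    name="bp",
    response_domain="Numeric",
    concept="Blood pressure",
    instructions=["Seat the respondent before measuring"],  # invented example
    measurement_type="protocol",
)

captures = [question, measurement]
shared = [(c.name, c.response_domain, c.concept) for c in captures]
print(shared)
```

Code that documents a mixed sequence only needs to know about Capture; the Question-specific text and the Measurement-specific type stay on the subtypes.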
The following images represent the evolution of our thinking about the DataCapture model.
Flavio describing how a ProcessStep uses ControlConstructs and Acts:
A snapshot of the model at the beginning of the week:
Then near the middle of the week:
And finally at the end of the week:
Complex Instrument (aka DataCapture)
Summary of Activities – Minneapolis Sprint, May 29, 2015
The ostensible goal of the Data Capture group was to extend our previous work on Simple data captures or instruments to Complex ones. The Simple-Complex dichotomy has long been considered a false one, in that no clear characteristics distinguish a simple instrument from a complex one. Creating an information model for Data Capture has only further clarified this: the model developed by our team over the past year and a half is not perfect, but it is relatively comprehensive, parsimonious and (so far) robust enough to document many different data captures of differing complexity (http://lion.ddialliance.org/package/datacapture). We have been conscious of defining data capture broadly to include non-survey scenarios and processes – blood pressure data capture continues to be a useful exemplar and use case to model for our group and others.
DataCapture had already applied use cases of different complexity to the model during the Dagstuhl sprint (https://ddi-alliance.atlassian.net/wiki/pages/viewpage.action?pageId=491584). For the Minneapolis sprint, the primary goals were to compare the DDI 4 capture model to relevant DDI 3.2 objects and reconcile any differences. The work begun here was documented in a cross-walk (https://docs.google.com/document/d/1n1fu9naHRbsd5HVgKzDubsbltwtBWe0dUDRwT7TlWjE/edit?usp=sharing) and could prove a useful template for other groups in ensuring the backwards compatibility of DDI 4 with 3.2, as well as in identifying any gaps or oversights in DDI 4 models. This exercise resulted in (1) renaming the Capture object to Measurement and revising its properties to clarify its broad non-survey nature; (2) specifying the properties and definitions of (nearly) all objects in the model; and (3) tweaking the cardinalities so that most objects are not required. Discussion was also initiated on the relationship between Capture and Methodology – there seems to be a relatively clear “fit” between the two at the Conceptual (Design) and Sequence (Process) levels.
Outstanding issues are:
- Clarifying the nature of the relationship to the Data and Process models; further work awaits the refinement of these models
- The idea was floated during soap box that a small number of “standard” use cases should be created and applied in each of the DDI 4 information models. This will ensure that the groups are not developing their models in isolation from each other.
- Clarification of ResponseDomain and how this object (Representation model?) will be used to define data type
- Nothing much was resolved about the role that DDI 3.2 objects QuestionGrid and QuestionBlock will play, if any, in DDI 4.
Observation is the link to the data description model.
- A datum is an instantiation of an observation.
There’s no clear distinction between simple and complex.
To accomplish:
- Throw more use cases at this
- Break it with more complex use cases
- Touchpoints – where it’s interacting with other models
- Linking each of these objects with its DDI3.2 counterparts.
- Making sure DDI3.2 can be represented in here.
- Backward compatibility
How do we go about doing this?
- Make sure all the classes from 3.2 are in here.
Is the model in 4 more generic than 3?
Start with ConceptualInstrument and ImplementedInstrument.
- Abstract has been removed from the description
- Useful to think of this and Implemented at the same time.
- We should decide for both which properties make sense
- And if the cardinality of these relationships is correct.
- Let’s “map” this to 3.2
- Is the cardinality off between Conceptual and Implemented?
- What if you don’t have a Conceptual Instrument?
- DDI has said that it doesn’t have required instruments.
- An implemented instrument comes from a conceptual instrument, so 0..1
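The 0..1 cardinality discussed in the list above can be expressed as an optional reference. This is only a sketch: the attribute name `conceptual`, the helper `describe`, and the instrument names are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ConceptualInstrument:
    name: str


@dataclass
class ImplementedInstrument:
    name: str
    # Cardinality 0..1: an ImplementedInstrument may exist without a
    # ConceptualInstrument, and has at most one.
    conceptual: Optional[ConceptualInstrument] = None


def describe(instrument: ImplementedInstrument) -> str:
    """Report whether the implemented instrument has a conceptual parent."""
    if instrument.conceptual is None:
        return f"{instrument.name} has no conceptual instrument"
    return f"{instrument.name} implements {instrument.conceptual.name}"


design = ConceptualInstrument("evaluation questionnaire design")
web_form = ImplementedInstrument("web form", conceptual=design)
paper_form = ImplementedInstrument("paper form")

print(describe(web_form))
print(describe(paper_form))
```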
In-Person Meeting (London EDDI) - 12/2/14
Attendees: Barry, Jannik, Hilde, Guillaume, Jon
Everyone is generally OK with the model. There is a lot of working around the edges.
MODE: Display. Extensions to the model - these issues need to be addressed. Throw this in the parking lot? But Jannik makes the point that such issues of Display should be in the ImplementedInstrument.
Going forward: We will use the DDI controlled vocabulary for Mode of Data Collection to determine which use cases we throw at the DataCapture model. We should also probably add modes to this CV since it doesn't quite cover every type of data capture. http://www.ddialliance.org/Specification/DDI-CV/ModeOfCollection_1.0.html
- We also need to determine some type of form or define the fields or protocols that we will use to apply use cases, so that the application of use cases is standardized.
- Also, Jannik created a new model that clarifies the relationship between ExternalAid, Instruction and Capture.
Finally, to re-iterate, there are no major objections to the DataCapture model. We continue to tinker with minor things. We need to move this to the next level, and applying use cases at this point would probably continue to prompt additional minor tinkering. There was consensus that we should consider most of our work done and send the model along to the modelers.
GTM Meeting - 9/30/14
Attendees: Barry, Jannik, Hilde, Steve, Jon, Guillaume
Agenda: Review model developed out of Dagstuhl. Propose and discuss modifications. Decide what is further required to send to modelers.
1. Hilde: Is process step relevant to PAPI mode? Is DataCapture model currently biased towards electronic modes of capture?
This begs the question about MODE, which in turn leads to discussion of Layout.
- How to render a PAPI layout in DDI?
- INSEE uses ExternalAid to point to XSLT to determine layout and sequence for such modes. Wendy says it is a workaround. ExternalAid as a way to help Respondent understand, interpret, and complete the instrument.
- CAI programs have such layout built into their software - does the layout in CAI systems (on the screen).
- "Simple" PAPI instruments not so simple? Clearly, DISPLAY (aka Form or Representation) is becoming a critical component.
- Typography (bolding, emphasizing text), Color and other Perceptual elements (a scale represented as a 1-inch horizontal line on which R's indicate their "distance"), Layout.
2. Hilde: in Question, remove synonym and reference to QuestionItem in DDI3.x to make it more flexible and reusable.
- A related item: some questions include an intro statement that can be considered part of the question. Revisiting Statement vs. Question - the distinction won't always be clear. We need to define, provide examples, include synonyms and make things as unambiguous as possible to give guidance on using objects appropriately, but we won't be able to control users' behavior. Some will invoke different DDI elements in ways that weren't intended. We can only do our due diligence to dissuade such behavior.
- RE: statement v. question, Steve suggests applying known use cases and see how many different ways we define statement.
Phone call with Jannik, Barry.
Cleaning up the model before releasing to the group.
- Jannik proposes linking ExternalAid and Instruction directly to Capture instead of thru InstrumentComponent.
- The value of InstrumentComponent is to have one object that is related to Process - and InstrumentComponent can be a statement OR a capture.
- Jannik will also explore the 'touch-points' DataCapture has with other models.
- Protocol and State (which are great and appealing concepts) are not in the current model, but are likely to be in the future.
Going Forward: Set up a GTM with larger group next week and gain consensus on the model.
Phone call with Jannik, Barry.
Confirmed that Statement is readable text, distinct from ExternalAid
Could Instruction have a similar aggregation relationship to Instrument Component as ExternalAid does? I leave this to Jannik and the modelers to decide if they can point to a relevant use case.
Confirmed that Instance Variable in Data Description has a relationship to Capture.
Outstanding issues:
In a manner similar to how DataCapture uses the Process model, could we use Group (aka Collection) to model longitudinal use cases? This is the approach that Sophia brought up a few months ago. Jannik thinks: in a longitudinal data capture, we could have a Concept which would be versioned and instantiated as a Capture (Question) over time.
Jannik discussed one of our use cases; the same Concept being instantiated differently depending on the mode of administration (in a survey). Somewhat similar to a longitudinal use case, Capture could have a nested relationship with itself. A web question and a text question: currently these 2 captures only have a relationship via Concept? We're going to leave the model as is: different Captures are related via Concept.
State and Protocol: when these are eventually included in the model, they would probably be situated in Capture. Protocol should be a small model in itself because you need to invoke the Process model to make Protocol work. State is nearly synonymous with paradata, as Jay says. It too could be a small model in itself. Are State and Protocol used anywhere else in DDI 4?
Is ResponseDomain the same object (one of the Domains) in LogicalDataDescription? It is tedious to define the response domain twice and if it actually is living in LogicalDataDescription, then that's what DataCapture should point to. Jannik will find out from the modelers.
Regarding the design principle of only one way to accomplish one thing; the idea of QuestionBlocks is revisited. If you have a set of questions designed to be administered via a block/sequence, as a whole these are a Concept (e.g., CESD scale). These would be implemented via Process; does Process need to have a relation to Concept? Or are these blocks considered fodder for Group or Collection?
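The decision recorded above, that different Captures are related only via Concept, can be illustrated with a small lookup. This is a sketch under the assumption that each Capture points to exactly one Concept; the function name `captures_by_concept` and the example capture and concept names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def captures_by_concept(captures: List[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group capture names by the Concept they point to."""
    related = defaultdict(list)
    for capture_name, concept in captures:
        related[concept].append(capture_name)
    return dict(related)


# The same Concept instantiated differently per mode of administration.
captures = [
    ("web question", "Income"),
    ("text question", "Income"),
    ("hours question", "Travel time"),
]
related = captures_by_concept(captures)
print(related["Income"])
```

With no direct Capture-to-Capture relationship, finding the mode-specific variants of a question means going through the shared Concept, as the lookup does here.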
Going forward:
Check if Statement, Instruction, ExternalAid belong or live in other models?
ResponseDomain redundant with something in LogicalDataDescription?
Tighten up the properties, descriptions of each object.
Schedule another talk for next week, and prepare to update the group.
What to do with Instructions? Does it relate to Statement or Act? It is an object which relates to Act.
Question has a property of text.
Use cases
Survey example - Dagstuhl evaluation questionnaire
Note need to make sure our definitions are tight enough to give guidance to users on how to use objects appropriately
1 Simple Survey
~~~~~~~~~~~~~~~~
* LogicalInstrument
+ Sequence (top level, type of ControlConstruct)
- Statement (type of Act)
Dagstuhl would like your feedback to (1) provide feedback to
organizers and participants, (2) improve our organization, and (3) for
reports to our funding agencies. We will send you an email with
seminar-related results of this survey.
Feedback is shared only in aggregated form that cannot be used to
identify the origin of a statement or grade -- we guarantee your
anonymity! However, if you rather not trust us, we are still very much
interested in your feedback. We would be glad to receive an email from
you at survey@dagstuhl.de where you describe your experience in your own
words.
- Instruction (type of Act)
Some questions might not apply to this seminar or your personal
situation. Please skip any question that does not apply. Otherwise,
please select the most appropriate answer.
- Sequence: Trends and Changes (type of ControlConstruct)
* Question (type of Capture)
- QuestionText: Do you see trends or changes, at Schloss
Dagstuhl or elsewhere, that we should resist or reverse? If so, what are
they?
- ResponseDomain: Text
* Question (type of Capture)
- QuestionText: Do you see trends or changes that we should
pursue to ensure Schloss Dagstuhl's relevance?
- ResponseDomain: Text
- Sequence: Personal Questions
* Question (type of Capture)
- QuestionText: Please rate your professional seniority
- ResponseDomain: Code
- Junior
- Senior
* Question (type of Capture)
- QuestionText: What is your primary occupation?
- ResponseDomain: Code
-
* Question (type of Capture)
- How many hours did it take you to get to the Schloss?
- ResponseDomain: Numeric
* Question (type of Capture)
- QuestionText: What is your country of residence?
- ResponseDomain: Code
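The outline above can also be written down as a nested structure and walked programmatically. The sketch below is one possible encoding, not a DDI serialization: the `Node` class, its field names, and the shortened labels are assumptions, and the Statement and Instruction texts are abbreviated rather than quoted.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    kind: str                              # e.g. "Sequence", "Statement", "Question"
    label: str
    response_domain: Optional[str] = None  # only used for Question nodes
    children: List["Node"] = field(default_factory=list)


def questions(node: Node) -> List[Node]:
    """Collect Question nodes in document order (depth-first)."""
    found = [node] if node.kind == "Question" else []
    for child in node.children:
        found.extend(questions(child))
    return found


survey = Node("Sequence", "top level", children=[
    Node("Statement", "feedback and anonymity notice"),
    Node("Instruction", "skip questions that do not apply"),
    Node("Sequence", "Trends and Changes", children=[
        Node("Question", "trends to resist or reverse", "Text"),
        Node("Question", "trends to pursue", "Text"),
    ]),
    Node("Sequence", "Personal Questions", children=[
        Node("Question", "professional seniority", "Code"),
        Node("Question", "primary occupation", "Code"),
        Node("Question", "hours to get to the Schloss", "Numeric"),
        Node("Question", "country of residence", "Code"),
    ]),
])

domain_counts = Counter(q.response_domain for q in questions(survey))
print(len(questions(survey)), dict(domain_counts))
```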
Parking lot - Question Grid and Question Block related to Capture
Jay showed the process model to the group to begin thinking about how process model links in with instrument model.
We looked at simple process in Drupal. In 3.2 there is something called instrument control. In the process model it was called Act.
Jay: What happens to the instrument control? Do you pull in the control construct set of objects?
Logical instrument is the design of the data capture. It should point to the process model. It needs to be nailed down a bit more. We would assume label, description, etc. are included. The GSIM questionnaire specification had useful text. This has a relationship with control construct (0..n)
Control Construct has subtype of Act and Capture is a subtype of Act. Act has a relationship to Statement. Statement is explanatory text etc. Need to set the scope
Capture has question and measure. Question is survey specific, Measurement is the more generic
Capture has response domain.
Instrument control defines how to administer the questionnaire. The sequencing and loops reside in process (in control construct). Use elements of process when it is needed. But can this handle the complicated use cases - example of blood pressure cuff. THIS IS CONTROL CONSTRUCT
Value Domain links to Response Domain
Parking lot - relationship to Conceptual objects, Question block
Next step is to start putting examples against this version of model.
GTM Meeting - 9/30/14
Attendees: Barry, Hilde, Jannik, Sophia, Steve, Jon, Guillaume
Agenda: Reviewing issues raised by Guillaume and Hilde
(1) Represented question. In the simple model as it stands, the distinction between represented and instance isn't needed? Yet if this causes problems when the model is extended to the complex case, then we should include it. According to Jannik, there is nothing in the current Simple Instrument model (using Views and Packages) that prohibits reusability. However, this has not been clarified by the overall modelling group.
(2) Question Grids. Grids are basically a way of describing a table and cells and their location in a self-administered or face-to-face mode. E.g., a question is a 1x1 grid. Included but should be in ComplexInstrument.
(3) Redefine Statement to include capture instructions. Done.
(3a) External aids can be referenced by either InstrumentControl or Question. Jannik says being able to do this 2 different ways is a problem. There are 2 ways to do this because of the need to be reusable in both InstrumentControl and Question. Again the issue of reusability is determinant. Table for the overall modelling group.
(4) RepresentedQuestion/ConceptualVariable. A design pattern (at a higher level of abstraction) trumps the need to include language, time, etc. at the lower (simple level). The design pattern is referring to Sophia's point that there are 4 properties of Groups (Panel, Language, Geography, Time). See the principle in (1) that Jannik made about modelling principles, in which characteristics that are shared across Views or other Simple Objects should be modeled at a higher level. This is the same issue as reusability; is this something that has to be dealt with at a lower granular level. Table for the overall modeling group.
(Other issues from Hilde's email)
Arrows in drupal are in wrong direction - yes, we should use another tool to develop the diagram.
Update: the diamonds which indicate aggregation are still being rendered incorrectly.
Jannik is aware and trying to fix.
ResponseDomain terminology is very centric to question-response environment - consider changing.
Process model is all the if statements, loops, repeats, etc. Do we keep Statement and InstrumentControl in Instrument or rely on it being in the Process model? This is another issue that we should Table for the overall modeling group.
Responses From Modeling Group:
Modeling questions raised in Simple Instrument:
1 - if we are going to do one thing only - there should only be one way to do any one thing
2 - grouping will be available in the next 2 weeks
3 - duplication should be handled by the modelers
4 - statements in questionnaire
Statements, Instructions, Related materials: are these the same or different, and how do they fit into a process model and possibly question?
Followup Phone call with Jannik about Modeling Group responses:
Instrument can own statement. Question can hold statements - then InstrumentControl would have Questions and ControlConstruct and Capture.
Relatedly, our current InstrumentControl only controls the process; we are currently conflating capture instructions with those functions (skips, flow logic, etc.) that are part of ControlConstruct (in Process Model). Fixed the examples and descriptions of InstrumentControl in Drupal.
Human readable instructions for administering the instrument should reside in Statement, which includes human readable capture instructions.
InstrumentControl then defines how the instrument or data capture tool is administered, just like we define it now.
Sequence and flow is the domain of the Process model.
from Jannik's doodling:
external aid, interviewerInstructionReference, externalInterviewerInstruction (these live in QuestionItemType), statement
in 3.2 these are all essentially the same properties found in QuestionItemType
Statement will contain all of these 'types' - i.e. we will keep Statement as an object with different types.
Statement will become a SuperClass.
Does Statement have a relationship to the Process model? If Statements need to be sequenced and routed along with question text, for example, then yes, it needs to be related to ControlConstruct.
In/OutParameters (also in QuestionItemType) should be addressed, but only in the complex Instrument (an example would be a stored item in a survey, i.e. a person's name, that is referenced in specific question text).
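The idea noted above of keeping Statement as one object with different types might look like the sketch below. It reflects the thinking at this call only, and it assumes a single `type` attribute drawn from a fixed list; the enum name `StatementType`, the helper `of_type`, and the example texts are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class StatementType(Enum):
    # The 3.2 properties listed in the notes, folded into one typed object.
    STATEMENT = "statement"
    EXTERNAL_AID = "external aid"
    INTERVIEWER_INSTRUCTION_REFERENCE = "interviewerInstructionReference"
    EXTERNAL_INTERVIEWER_INSTRUCTION = "externalInterviewerInstruction"


@dataclass
class Statement:
    text: str
    type: StatementType = StatementType.STATEMENT


def of_type(items: List[Statement], wanted: StatementType) -> List[Statement]:
    """Select the statements of one type, keeping their order."""
    return [s for s in items if s.type is wanted]


items = [
    Statement("Welcome to the survey."),
    Statement("Show card A.", StatementType.EXTERNAL_AID),
    Statement("Read each option aloud.", StatementType.EXTERNAL_INTERVIEWER_INSTRUCTION),
]
aids = of_type(items, StatementType.EXTERNAL_AID)
print([s.text for s in aids])
```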
Our discussion led to further (perhaps only a reiterated) definition of our task: to define the elements in Instrument. How they are used is the domain of the Process model.
This is relevant to the Grouping (Geography, Language, Time, Panel) - Grouping defines how elements are used. But we'll see what the modeling group comes up with vis-a-vis Group in a few weeks.
Yet to do for Dagstuhl: further define InstrumentControl, ResponseDomain, and LogicalInstrument
Revised Simple Instrument Objectives:
Simple instrument objectives - Working document 20140909.docx
GTM Meeting - 9/9/14
Attendees: Barry, Jon, Sophia
Agenda:
(1) Continue discussion of how to use Hilde and Sophia's document to describe the objects in simple and complex instruments.
We don't have a quorum, but need to make progress. There are a few characteristics to resolve in Hilde/Sophia's document; should these be hashed out at Dagstuhl?
(2) Individually grind through the objects in our current diagram and resolve any controversies or disagreements.
Flesh out Drupal objects, roll document characteristics into Drupal, more use cases, examples, descriptions.
Need modelers' input; a modeler needs to be part of our conversation as we think about these different objects, their definitions and how they relate to each other
That said, we think we are nearly ready to hand this off to a modeler for input - we have had modeler types already opine about this (Sophia, Guillaume, Brigitte)
How many of the objects are solid; of course there will be quibbling about minor issues, but we think the model is robust; let's see if it can be broken.
Sophia: could benefit from modeller - our instrument is small part of larger picture.
DDI has 800+ objects - are other 'somethings' (datasets, etc.) being modeled similarly? Are those groups using similar processes?
Revised Simple Instrument Objectives:
Simple instrument objectives - Working document 20140827.docx
GTM Meeting - 8/27/14
Attendees: Barry, Sophia, Steve, Jon
Agenda:
(1) Review Hilde and Sophia's document detailing simple and complex characteristics.
Does this document help to define our task? If not, how can it be improved?
Yes, it is helpful by arraying both simple and complex side by side.
Review Achim's email. Does this help define our task?
Yes, we take a step back.
Perhaps "Simple" can be further defined by what it is NOT?
Sophia: if we have problems in common (simple vs. complex) perhaps someone can come up with guidelines.
We need to discuss/review the document with the whole group.
Barry: Simple vs. Complex are not explicit easily-defined distinct categories - better to think of them as ends of a continuum.
How should we use the document?
Jon: we have to decide where the line is, knowing there is some margin of error.
The boundary between simple/complex is informed by the knowledge that we have to extend the model at some point.
Steve: the distinction has hindered progress.
Need to know what the complex case is before defining simple; we should almost do this backward modelling from complex to simple, drawing a line around the simple case.
Data description group is doing something like this.
(2) Determine what else our object model needs in order to send to modeller (Jannik).
Walk thru each object and relationship one-at-a-time.
Walking thru Hilde and Sophia's document
Determine which objects are hindering progress, and either:
resolve to everyone's satisfaction, or
assign to Parking Lot
Action Items:
(1) Add In/Out column to Hilde/Sophia's document and distribute to the group for discussion.
(2) Explain our philosophy that we really need to start with the complex and then subset the simple case.
(3) Determine pain points in the current model figure - go through each one-by-one and try to resolve those pain points.
(4) Schedule another GTM meeting for 2 weeks, same time. There is no time that isn't inconvenient for someone.
Hi Group! Here is the document from Hilde and Sophia:
Simple instrument objectives - Working document 2014_06_26.docx
GTM Meeting - 6/19/14
Attendees: Barry, Sophia, Hilde
Agenda: Discuss the definition of simple vs. complex instrument; Review Hilde and Sophia's document detailing simple and complex characteristics.
The document is only to describe the characteristics of a simple instrument - modelling will occur after the group has had a chance to review this document.
Approaching this task as bottom-up, not top-down. Domain-specific experts describe the problem, then modelers create the software/standard. The simple instrument implies a questionnaire or survey, but this should not hinder or limit the extension of simple to more complex and non-survey instrument models.
Barry will send annotated document back to Sophia and Hilde who will modify it once more before distributing it to the wider group.
GTM Meeting (Toronto) - ??/??/14
Attendees:
News:
(1) Simple Instrument modelling is tentatively scheduled to be completed Sept., 2014.
(2) "Content" which is what we are modelling (i.e. Simple Instrument) is also known in the Moving Forward project as a "Functional View".
Goals:
(1) Update from Sophia and Hilde: Description of our expected task, distinguishing between Simple Instrument use case and Complex Instrument; update our description in Drupal and Wiki.
Is SimpleInstrument really meant to describe a simple survey, and ComplexInstrument meant to describe nearly everything else? Or do simple and complex exist as opposite ends of a continuum that is defined by the amount of InstrumentControl that is required to field or describe the instrument?
One of the modelling principles assumed by the Moving Forward project is that elements in a simple object will be present in the complex object. This could pose a problem in the future, i.e. a complex data capture such as biomarker collection may not use Question per se.
(2) How to update the Drupal diagram.
Brigitte and Olof helped fix Statement-InstrumentControl and Capture-Response domain relationships. Doing this we found that Drupal is not accurately rendering relationships (arrows and diamonds are on the wrong ends), based on the specification and description of the relationship.
Relatedly, should Measure have the same relationships to other elements as Question does? if so, why distinguish between the two; why not just have another superclass that combines them?
GTM Meeting - 5/16/14
Attendees:
Barry (meeting minutes), Achim, Sophia, Guillaume, Hilde
Goals of this meeting:
(1) How best to use our Wiki space?
Is viewable by anyone who knows the URL.
All information (conversation) is documented and does not get lost.
Organize by topic; datestamp and initial contributions;
(2) What technique is the best way for us to model Content? Are UML diagrams appropriate at this point, or should we employ a less technical approach (use simple text, flow chart, etc.)?
- draw freehand diagrams to begin discussion; when we draw an arrow we add text to explain the arrow or relationship. See how the text in example below describes the relationships.
- group members become literate in UML (see useful UML quick reference from Thomas Bosch).
(3) Review and discuss our task - to what extent do we consider ComplexInstrument characteristics?
Describe our task: to describe a simple survey instrument (need to define: what is simple survey instrument compared to complex survey instrument?)
Develop a robust model that is comprised of core elements that ably describe a simple survey instrument, but which doesn't break down when applied or extended to more complex situations.
"Capture" is an abstract class.
Diagram is incorrect:
Statement should have a different relationship to InstrumentControl (InstrumentControl uses Statement)
Also, Capture uses ResponseDomain - need to change the direction of the arrow.
Include a new Instrument class called "other data sources?" Keep in mind that Measurement was originally intended as a 'placeholder' for more complex data capture contexts. For now, we will table (or put in the parking lot) Measurement and other complex data captures, and concentrate on a truly simple survey instrument, but Measurement may have very similar relationships to other elements and end up acting much like Question.
Homework:
(1) Define our task! Hilde and Sophia work together to define characteristics which distinguish simple from complex instruments.
(2) Sophia and Guillaume determine what elements should be (temporarily) assigned to the Parking Lot and determine how to create folders to store such things (there was more to this, but I can't recall).
(3) Barry will figure out how to log into Drupal and clean up incorrect relationships.
(4) All will try to schedule another GTM during the Toronto sprint. Barry will send a Doodle poll to non-Sprinters and talk to Therese about a venue at Toronto.
Diagram as of 10/3/14
Expanded view of SimpleInstrument and how it relates to other foundational objects.