1st Oct - Collective notes

Intros

  • Michelle Edwards - keep y'all on track and get creative outputs
  • Simon Cox - Australian applied science. interest: cross domain data and meta-data standards. working on DCAT with W3C Data Exchange. join stats, social science, health domains.
  • Simon Hodson - studied philosophy, French history, ... digital humanities. for common thread, my background is all about definitions.
  • Steve McEachern - co-lead ... chair of DDI Alliance exec committee and @@1 Australian social science data archives. cross domain: some health, environmental. led us to convergence. will expound in goals of workshop
  • Arofan Gregory - consultant - standards in international trade statistics and SDMX-QB. DDI modernization. 
  • Harold Solbrig - Johns Hopkins University after 20 years with Mayo Clinic - representing FHIR. worked on standards like 11179, OWL. interested in statistics in research. interested in RDF representations of healthcare data for secondary use.
  • Peter Winstanley - government statistics, use of metadata to promote interop between state administrations, transparency, use of public-sector info as resource for business development [[unmerged: Scottish govt. clinical microbiologist. classification, machine learning. worked as advisor on social care. worked on uses of metadata to promote interop between state administrations for EC. our gov't requirements are very broad (transparency, etc), but increasingly working on data for business development. automating pipelines to make them slick so energy can go into profits. work together to reduce friction. work in W3C Data Exchange WG. my interests are in DCAT, AP application profiles, conneg if there are multiple APs]]
  • Adrian Burton - Australian research commons ANDS (ARDC) - National research infrastructure - cross-domain, catalogue, identifiers, vocab services - linguistics - cross-cultural communication
  • Steve Richard - adjunct faculty Columbia University - geologist - working with data catalogs for EarthCube. worked on 19115-3 update. Metadata catalogues, ISO 19115-3 editor (XML Geospatial metadata), Geoscience information interop - model + vocabularies. want an RDF vocab to harmonize
  • Rebecca Koskela - loosely associated with University of New Mexico. executive director of DataONE + co-lead on RDA metadata activities, representing CERIF - background bioinformatics and high-perf computing. looking forward to pilot projects
  • Alejandra Gonzalez Beltran - Oxford University - developing models and software tools to support them. working with NIH big-data-to-knowledge. models/ontologies/vocabularies for NIH data → knowledge initiative, DATS. working in W3C Data Exchange on DCAT, will provide a DCAT view of DATS.
  • Ray Plante - NIST - background: astronomy; worked in Virtual Observatory data grid. working at NIST on many domains. interested in how you can practically combine data from different standards into a single application and how to evolve standards without breaking stuff.
  • Phil Archer - *used* to work for W3C, now GS1 (Barcodes) - semantic web standards (standardization of data *except* XML) - one answer for everything: 'use the web' ... the way it was intended and designed (not just for throwing PDFs around) - looking for step-change in the way we do metadata collection - hand-crafted does not scale
  • Armin Haller - computer science/management, working cross domain risk management and business transformation. teaching students how to deal with data. was in W3C Web Services stack. worked on ontologies, e.g. [SSN](https://www.w3.org/TR/vocab-ssn/). observing DCAT progress. interest: bringing data onto the web and enabling sensor devices. don't want google to dictate the vocab everyone uses.
  • Wendy Thomas - U Minnesota - Minneapolis population center - analysis of ... helping capture of provenance of data items and study-level data 'cause they've been focused just on data. been at DDI since inception. focus on aggregate/descriptive data with geographic context, how to plug in to accurately exchange metadata within and across standards
  • Fernando Gouvea Reis - Infectious Disease Data Observatory, University of Oxford, background: medical sciences, pharmaceutical and @@@ health - infectious disease database/system - put together clinical, lab, epidemiological data from around the world - to make open and available - pilot in infectious diseases. want y'all to help make our data sets more interoperable
  • Virginia Murray - Public Health England - medical doctor - use of data to help UN member states deliver data - as medical doctor, head of global disaster risk reduction - how to use data to deliver to this huge target - using DRR Open Data Newsletter to reach data scientists around the world. problems with definitions, how we share them and how we learn from each other.
  • Dan Gillman - US Bureau of Labor Statistics - metadata things for a long time. involved with DDI modernization efforts. worked on DDI since inception in 95. developer and chair of 11179. UN Statistical Program liaison with official national statistical offices, concerned with definitions and concepts. at BLS, using DDI to document one of our complex surveys from beginning to end. at BLS, developing glossary of our technical terms (terms and concepts), in particular to support fancy queries for folks who don't understand the
  • Bill Michener - University of New Mexico - started as Librarian/oceanographer/landscape ecologist - working on tools in env/eco science arena through DataONE, SEEK - Kepler, DMPT, DataONE federated infrastructure - 3 new people, 3 acronyms, 3 principles that we can tell people about.
  • Eric Prud'hommeaux - W3C - where is RDF useful? plumbing vs science - keep pushing integration of clinical with other data
  • Larry Hoyle - institute for policy social research at University of Kansas - thought I was trained as a social psychologist but moved to data sciences - what can we do to help individual researchers to capture metadata? What is happening outside of DDI?
  • Jay Greenfield - volunteer with DDI Alliance - cross domain use between social sciences and health sciences - integrating data network in Southern Africa, HIV/AIDS data - modernisation of official statistics - was at Booz Allen Hamilton and helped rescue Obamacare website, natl children's study, caBIG. data discovery over telephone data - big data with NSA
  • Joachim Wackerow - GESIS social sciences libraries - Social science + programming training - help at interface between social science and data science - DDI Alliance - looking for output with impact - look for 3 document outputs

Workshop Goals

Steve McEachern

Workshop program will evolve/respond to daily discoveries

Analyse data requirements of pilots - where have I seen these requirements before?

Understand overlaps; Make explicit the implicit vocabularies that we already use;

  • how to go into details - fine-grained ideas, variables; how do vocabularies facilitate this interoperability
  • how to go from domain-specific to domain-independent

Proposed outputs: 

  • Background paper - description of workshop and pilots
    • Data requirements of pilots - infectious diseases, disaster risk
  • DCAT profiles - is DCAT a rallying point for use of x-domain standards in domain-specific context?
    • DDI
    • SSN
  • Use of PROV (?) to describe data transformation
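
A rough sketch of what the DCAT + PROV idea in the outputs above might look like for a single transformation step. The DCAT, PROV and Dublin Core namespaces and terms are the published ones; everything under `ex:` (dataset names, the cleaning run, the title) is a made-up placeholder, not something from the pilots, and the helper functions are hypothetical.

```python
# Prefixes: dcat, prov and dct are the published W3C/DCMI namespaces.
# "ex:" is an invented example namespace used only for illustration.
PREFIXES = {
    "dcat": "http://www.w3.org/ns/dcat#",
    "prov": "http://www.w3.org/ns/prov#",
    "dct": "http://purl.org/dc/terms/",
    "ex": "http://example.org/pilot/",
}


def describe_transformation(raw, derived, activity, title):
    """Return (subject, predicate, object) triples linking a derived dataset to its source."""
    return [
        (raw, "a", "dcat:Dataset"),
        (derived, "a", "dcat:Dataset"),
        # The title is wrapped in quotes as-is; it must not itself contain double quotes.
        (derived, "dct:title", f'"{title}"'),
        (activity, "a", "prov:Activity"),
        (activity, "prov:used", raw),
        (derived, "prov:wasGeneratedBy", activity),
        (derived, "prov:wasDerivedFrom", raw),
    ]


def to_turtle(triples):
    """Serialise the triples as simple Turtle, one statement per line."""
    header = [f"@prefix {prefix}: <{iri}> ." for prefix, iri in PREFIXES.items()]
    body = [f"{s} {p} {o} ." for s, p, o in triples]
    return "\n".join(header + [""] + body)


# Hypothetical example: a cleaned dataset derived from a raw one.
triples = describe_transformation(
    raw="ex:case-reports-raw",
    derived="ex:case-reports-cleaned",
    activity="ex:cleaning-run-1",
    title="Cleaned case reports (hypothetical)",
)
turtle = to_turtle(triples)
print(turtle)
```

Each further phase of transformation would add another activity and derived dataset in the same pattern, so the chain of `prov:wasDerivedFrom` links records how the data moved from one phase to the next.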

Study level (dataset, project) vs experiment level

  • how to describe the content of a cell? - level below dataset 

What are common formats? How are they described in different metadata standards?

How do we describe transformations? How do we move through progressive phases of data transformation?

Steve Richard - Need to focus on a couple of specific data problems

Simon Hodson - from pilots, from deep-dive?

Virginia - only a single representative of big community - need papers to bust acronyms, and to signpost the resources available

Metadata is middleware/infrastructure - must not be intrusive

CODATA Pilots

Simon Hodson

Background to CODATA, ISC, ICSU, Data Integration Initiative, Pilot projects

Fernando Gouvea Reis

Tabulations of infectious disease data requirements

e.g. vectors → spatial distribution of mosquitos/camels

10% of medicines are falsified/sub-standard in LMICs

Challenges:

  • file formats
  • lack of standards
  • lack of data dictionaries
  • where are the datasets
  • restricted access to datasets

Phil Archer - Could the UN have a data access policy - "open-by-default"?

Virginia - World Data Forum meeting this month - ask the question

Virginia Murray

Disasters are the primary reason that governments fall

UN has had explicit programs in Disasters since 1990s

National risk registers

What do we mean by 'death' and 'mortality' ?

How many countries register births and deaths? not all ... most of Africa does not


Afternoon Session Part Two

CERIF

Used far more in Europe than in the US. Maintained by euroCRIS.

CERIF : https://www.eurocris.org/cerif/main-features-cerif ; see the RDA-DCC entry at http://rd-alliance.github.io/metadata-directory/standards/cerif.html


DCAT

Presentation https://goo.gl/isKxxf 

Notes: https://goo.gl/qZ9LYb

Comparison of 2014 and proposed version 2: http://www.semantechs.co.uk/viz_models/viewer/


metadataApproaches.dot is a base DOT file with entries for the metadata approaches; more information is to be added to it


SSN

Semantic Sensor Network Ontology https://www.w3.org/TR/vocab-ssn/


FHIR

Request from Harold for further 5 minutes to make more targeted comments.


Harold: FHIR

Alejandra: FAIRsharing



Summary Image (incomplete)

metadataApproaches.v2.dot is the dot file

Files

  • 20181001_102537_resized.jpg (JPEG) - Oct 01, 2018 by phil@philarcher.org
  • 20181001_102554_resized.jpg (JPEG) - Oct 01, 2018 by phil@philarcher.org
  • 20181001_102604_resized.jpg (JPEG) - Oct 01, 2018 by phil@philarcher.org
  • 20181001_102619_resized.jpg (JPEG) - Oct 01, 2018 by phil@philarcher.org
  • metadataApproaches.dot - a base DOT file for metadata approaches ... to add arrows to - Oct 01, 2018 by pedro.win.stan
  • metadataApproaches.v2.dot - dot file of some of the metadata approaches info - Oct 01, 2018 by pedro.win.stan
  • metadataApproaches.v2.png (PNG) - circo image of metadataApproaches.v2.dot - Oct 01, 2018 by pedro.win.stan