Discovery - Google Dataset Search will become the go-to, but is (currently) very shallow

  • little or no content from stats agencies ... because it is not (yet) visible in the catalogs harvested by Google
    • licensing/privacy constraints ...  
      • metadata must support access-restrictions - dct:accessRights, dct:license
    • some content comes from 3rd parties, VARs, or secondary sources (e.g. unemployment data from a biodiversity service!)
  • standardisation of thematic/semantic content
    • (In principle) is supported by e.g. dcat:theme, sosa:observedProperty, sosa:hasFeatureOfInterest, sosa:usedProcedure, sosa:madeBySensor
    • supported by controlled vocabularies/registers
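
As a rough illustration of the metadata points above, the sketch below checks whether a catalog record carries the properties that would let a harvester expose licensing, access restrictions, and thematic classification. The record layout (a JSON-LD-like dict), the function name missing_properties, and all example.org URIs are assumptions made for illustration, not taken from any real catalog.

```python
# Hypothetical check: which discovery-relevant properties does a record lack?
# The property names come from the notes; the dict layout is an assumption.
REQUIRED = ("dct:license", "dct:accessRights", "dcat:theme")


def missing_properties(record, required=REQUIRED):
    """Return the required property names that are absent or empty in record."""
    return [name for name in required if not record.get(name)]


# Illustrative record; every URI here is a placeholder.
record = {
    "@type": "dcat:Dataset",
    "dct:title": "Example unemployment series",
    "dct:license": "https://example.org/licence/open",
    "dcat:theme": ["https://example.org/theme/labour-market"],
}

print(missing_properties(record))
```

A record that fails this kind of check can still be harvested, but a search service would have nothing to show about its access conditions.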

API access

  • Search records link directly to a landing page, with no direct connection to the data
    • conventions for links from landing page to data
      • maybe add (dcat:)accessURL, (dcat:)downloadURL to the link-relations registry ... how are these related to describes?
  • information about format is rare (let alone schema!)
    • (In principle) is supported by dct:type, dct:conformsTo, dcat:mediaType, dcat:endpointDescription, but these are not yet used broadly
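
To make the landing-page-versus-data gap concrete, here is a minimal sketch of how a client might go from a dataset record to direct data links if distributions were described with dcat:downloadURL, dcat:accessURL, and dcat:mediaType. The dict layout, the function name data_links, and the example.org URLs are hypothetical; real catalogs may serialize these properties differently.

```python
def data_links(dataset):
    """Return (url, media_type) pairs for each distribution.

    Prefers dcat:downloadURL (a direct file) over dcat:accessURL
    (which may only point at a service or another page).
    """
    links = []
    for dist in dataset.get("dcat:distribution", []):
        url = dist.get("dcat:downloadURL") or dist.get("dcat:accessURL")
        if url:
            links.append((url, dist.get("dcat:mediaType")))
    return links


# Illustrative dataset; all URLs are placeholders.
dataset = {
    "dcat:landingPage": "https://example.org/datasets/42",
    "dcat:distribution": [
        {
            "dcat:downloadURL": "https://example.org/datasets/42/data.csv",
            "dcat:mediaType": "text/csv",
        },
        {"dcat:accessURL": "https://example.org/api/datasets/42"},
    ],
}

for url, media_type in data_links(dataset):
    print(url, media_type)
```

A record that only carries dcat:landingPage yields an empty list here, which is the situation the notes describe for most search results today.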

"Are these datasets broadly compatible & relevant?"

  • mechanics for cross-domain data harmonization?
  • most/best information is usually in the dataset abstract i.e. text
    • but ... abstract is always written for a specific audience
  • are these datasets really describing the same thing? 
    • links to standard terminology - controlled vocabularies
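
One possible way to approach the "same thing?" question is to compare the controlled-vocabulary terms two records point to, rather than their free-text abstracts. The sketch below assumes each record lists term URIs under dcat:theme, sosa:observedProperty, and sosa:hasFeatureOfInterest; the function name shared_terms and the example.org URIs are invented for illustration.

```python
SEMANTIC_KEYS = ("dcat:theme", "sosa:observedProperty", "sosa:hasFeatureOfInterest")


def shared_terms(a, b, keys=SEMANTIC_KEYS):
    """Return, per property, the sorted term URIs that both records reference."""
    result = {}
    for key in keys:
        common = set(a.get(key, [])) & set(b.get(key, []))
        if common:
            result[key] = sorted(common)
    return result


# Illustrative records; the vocabulary URIs are placeholders.
labour_survey = {
    "dcat:theme": ["https://example.org/theme/labour-market"],
    "sosa:observedProperty": ["https://example.org/prop/unemployment-rate"],
}
regional_stats = {
    "dcat:theme": [
        "https://example.org/theme/labour-market",
        "https://example.org/theme/regions",
    ],
    "sosa:observedProperty": ["https://example.org/prop/employment-rate"],
}

print(shared_terms(labour_survey, regional_stats))
```

Exact URI matching is the crudest possible test: it only works when both publishers use the same vocabulary, and a fuller approach would also follow mappings between vocabularies.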
