Primary and Reference Data
Wednesday progress report
From Darren Bell
The TL;DR is that we have built a prototype data-subsetting and linkage tool that uses the DDI-CDI Data Description Package as its logical data model and is implemented on a cloud NoSQL database (Amazon DynamoDB) with React.js and a simple serverless architecture. We had intended to use Smart Energy data as exemplar “long” data in this prototype but, in the end, did not have sufficient time to negotiate the necessary licences with the various governance boards. Instead, we have used a combination of social science and environmental “wide” data and synthesised time diary “long” data. The core goal of demonstrating real-time linkage and subsetting of wide and long data in a single platform is still well exemplified by this arrangement, and we still intend to integrate Smart Meter data in the next couple of iterations of the tool.
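To make the wide/long linkage idea concrete, here is a minimal in-memory sketch in plain Python. It is not the prototype's implementation (which runs on DynamoDB): the record layouts, the `person_id` key, the variable names and the `link_and_subset` helper are all hypothetical, chosen only to illustrate the pattern of subsetting on wide variables and attaching them to long records.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

# Hypothetical "wide" data: one row per unit, one column per variable.
WIDE: List[Record] = [
    {"person_id": "p1", "region": "North", "age": 34},
    {"person_id": "p2", "region": "South", "age": 58},
]

# Hypothetical synthesised "long" time-diary data: one row per unit per episode.
LONG: List[Record] = [
    {"person_id": "p1", "start": "07:00", "activity": "sleep"},
    {"person_id": "p1", "start": "08:00", "activity": "commute"},
    {"person_id": "p2", "start": "07:00", "activity": "breakfast"},
]


def link_and_subset(
    wide: List[Record],
    long: List[Record],
    key: str,
    wide_vars: List[str],
    keep: Callable[[Record], bool],
) -> List[Record]:
    """Attach selected wide variables to each long record whose unit passes `keep`."""
    # Index the wide rows by the linkage key for constant-time lookup.
    index = {row[key]: row for row in wide}
    linked: List[Record] = []
    for episode in long:
        parent = index.get(episode[key])
        # Skip episodes with no matching wide row or whose unit is filtered out.
        if parent is None or not keep(parent):
            continue
        merged = dict(episode)
        merged.update({var: parent[var] for var in wide_vars})
        linked.append(merged)
    return linked


# Subset to respondents under 40 and carry "region" onto their diary episodes.
result = link_and_subset(WIDE, LONG, "person_id", ["region"], lambda r: r["age"] < 40)
```

In this sketch the subset condition is evaluated on the wide side and the output keeps the long shape, so each diary episode carries the selected wide variables of its respondent.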
The attached Word document, written by one of our data engineers, Tom Gilders, gives an overview of the “data product builder” (as we call it), with some detail about datasets, query patterns and the use of DDI-CDI and the Data Privacy Vocabulary (DPV). The document also contains links and references.
For some less technical context, below is a whitepaper/manifesto/pamphlet from 2021, written by Darren Bell (UK Data Service) and Jon Johnson (CLOSER, the home of longitudinal research), which outlines some of our thinking about the practical problems around interoperability, particularly in the Social Sciences domain.
For additional context and reference, below is a UK Data Service presentation from IASSIST2022, which enumerates many of the points from the above document and references the “Data Product Builder”.
Files for the presentation in the workshop:
From Hilde Orten
The EOSC Future WP 6.3, Science Project 9 “Climate-neutral and Smart Cities” wants to combine resources from two Science Clusters. The European Social Survey (ESS) collects data related to political and social trust, health and health inequality, attitudes towards climate change and energy, understandings and evaluations of democracy and digital communication at work and with family, amongst many other topics related to the smart agenda. The purpose of this project is to facilitate the task of adding environmental data for a selection of countries and regions to complement the ESS survey data. For this purpose the ESS will upgrade its former ESS Multi-Level Application, and also make data available from the EOSC Platform.
Below is a presentation about the project given at the IASSIST conference in June 2022, with some additions for Dagstuhl 2022 concerning the challenges of integrating data from various sources for analysis.
From Franck Cotton
Interstat project - https://cef-interstat.eu/
Interoperability between NGSI-LD and statistical data models
SDMX / NGSI-LD (or more precisely Data Cube / NGSI-LD) converter
Development is ongoing
Code is published on GitHub
Detailed documentation is available
Additional test cases have been defined
Next development sprint is scheduled for 3-5 October in Palermo
DDI-CDI / NGSI-LD interoperability
Formalization of running example improved since Dagstuhl 2021
DDI-CDI test cases still rudimentary
Development not started
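As a rough illustration of the kind of mapping a Data Cube / NGSI-LD converter performs, the sketch below turns one observation into an NGSI-LD entity. It is not the Interstat converter's code: the `Observation` entity type, the URN scheme, the dimension and measure names, the sample values and the `observation_to_entity` helper are all assumptions made for this example. Only the general entity shape (`id`, `type`, `@context`, and attributes of type `Property` carrying a `value`) follows NGSI-LD.

```python
from typing import Any, Dict, List

# Core JSON-LD context published by ETSI for NGSI-LD.
NGSI_LD_CORE_CONTEXT = "https://uri.etsi.org/ngsi-ld/v1/ngsi-ld-core-context.jsonld"


def observation_to_entity(
    obs: Dict[str, Any],
    dimensions: List[str],
    measure: str,
) -> Dict[str, Any]:
    """Map one data cube observation to an NGSI-LD entity (illustrative only)."""
    # Build a URN from the dimension values, which together identify the observation.
    key = ":".join(str(obs[d]) for d in dimensions)
    entity: Dict[str, Any] = {
        "id": f"urn:ngsi-ld:Observation:{key}",  # hypothetical URN scheme
        "type": "Observation",                    # hypothetical entity type
        "@context": [NGSI_LD_CORE_CONTEXT],
    }
    # Each dimension and the measure become NGSI-LD Property attributes.
    for name in dimensions + [measure]:
        entity[name] = {"type": "Property", "value": obs[name]}
    return entity


# Made-up observation with two dimensions and one measure.
sample = {"refArea": "IT", "timePeriod": "2021", "obsValue": 59.2}
entity = observation_to_entity(sample, ["refArea", "timePeriod"], "obsValue")
```

A real converter would also need to handle code lists, units and the dataset-level structure definition, which this sketch deliberately leaves out.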
Luis Gonzalez, UNdata Portal Modernisation