Travel inventory has locations by city, community, and neighborhood (but many trips have no neighborhood recorded). There is also a household identifier, but we couldn't tell whether households are geolocated. Data also include whether trips were made on foot, by bus, or by car. Spatial resolution: possible by household, possible by barrio (neighborhood), reliable by community. Need centroids for the town, district and neighborhood designations.
Hospitalization diagnosis data is aggregated by year; air quality data from sensors is reported by hour.
Need higher temporal resolution on hospitalization to correlate with air quality data.
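
Until finer-grained hospitalization data is available, the only comparison possible is at the coarser resolution, i.e. the hourly sensor readings have to be rolled up. A minimal sketch of that roll-up, assuming readings arrive as (timestamp, value) pairs; the function name, the period labels and the sample values are all made up for illustration, not taken from the inventory:

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def aggregate_readings(readings, period="year"):
    """Average (timestamp, value) readings per 'year', 'month' or 'day'."""
    formats = {"year": "%Y", "month": "%Y-%m", "day": "%Y-%m-%d"}
    fmt = formats[period]
    buckets = defaultdict(list)
    for timestamp, value in readings:
        # Group each reading under the label of the period it falls in.
        buckets[timestamp.strftime(fmt)].append(value)
    return {label: mean(values) for label, values in sorted(buckets.items())}


# Made-up hourly readings, for illustration only.
hourly = [
    (datetime(2019, 1, 1, 0), 10.0),
    (datetime(2019, 1, 1, 1), 20.0),
    (datetime(2020, 6, 1, 12), 30.0),
]
yearly = aggregate_readings(hourly, "year")
daily = aggregate_readings(hourly, "day")
```

Rolling up to year discards almost all of the hourly signal, which is why the notes ask for finer hospitalization data rather than coarser sensor data.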
In general data in the inventory appears well curated.
Data we've looked at generally contains a geography.
Situation is as simple as it gets: single city/administration so there is already some harmonisation.
The data is basically fine - could do with some enhancement, connecting to metadata, identifying code lists etc.
Some linkages were apparent. We should think about generalising to the other pilots.
We should offer some steps that would make things easier. It took a while to notice that the hospital data was yearly, a very different granularity from the sensor data. Discovery metadata and variable metadata should include that.
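
One way to make the granularity mismatch visible up front is to record temporal resolution as a field in the discovery metadata and compare it before attempting a linkage. A rough sketch follows; the `DatasetMetadata` class, the field names, the dataset names and the four-level resolution scale are assumptions for illustration, not an existing schema:

```python
from dataclasses import dataclass

# Assumed scale, ordered from finest to coarsest.
RESOLUTION_ORDER = ["hour", "day", "month", "year"]


@dataclass
class DatasetMetadata:
    name: str
    temporal_resolution: str  # one of RESOLUTION_ORDER


def common_resolution(a, b):
    """Return the coarser of the two resolutions (the finest level both support)."""
    return max(a.temporal_resolution, b.temporal_resolution,
               key=RESOLUTION_ORDER.index)


def granularity_gap(a, b):
    """Number of steps on the scale between the two datasets."""
    return abs(RESOLUTION_ORDER.index(a.temporal_resolution)
               - RESOLUTION_ORDER.index(b.temporal_resolution))


# Hypothetical metadata records for the two datasets discussed above.
sensors = DatasetMetadata("air_quality_sensors", "hour")
hospital = DatasetMetadata("hospitalization_diagnoses", "year")

shared = common_resolution(sensors, hospital)
gap = granularity_gap(sensors, hospital)
```

With this in the metadata, a catalogue could warn about a large gap at discovery time instead of leaving it to be found by inspecting the data.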
We need datatypes for geospatial information, and metadata should say whether a dataset has geospatial information at all. Some data located things by name, some by lat/long etc. Need a gazetteer.
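
A sketch of how a gazetteer could bring name-based and lat/long-based records onto a common footing: use coordinates when present, otherwise fall back from neighborhood to community to city and return that area's centroid. The place names, centroid values and record field names below are placeholders, not real entries:

```python
# Hypothetical gazetteer: (level, normalised name) -> centroid (lat, lon).
GAZETTEER = {
    ("neighborhood", "example barrio"): (1.0, 2.0),
    ("community", "example community"): (1.5, 2.5),
}


def resolve_location(record):
    """Return (lat, lon) for a record, or None if it cannot be located."""
    # Prefer explicit coordinates when the record carries them.
    if record.get("lat") is not None and record.get("lon") is not None:
        return (float(record["lat"]), float(record["lon"]))
    # Otherwise try the finest named level first, then coarser ones.
    for level in ("neighborhood", "community", "city"):
        name = record.get(level)
        if name:
            centroid = GAZETTEER.get((level, name.strip().lower()))
            if centroid is not None:
                return centroid
    return None


by_coords = resolve_location({"lat": "3.5", "lon": "4.5"})
by_name = resolve_location({"neighborhood": " Example Barrio "})
fallback = resolve_location({"neighborhood": "Unknown",
                             "community": "Example Community"})
```

The fallback order mirrors the observation that neighborhood is often missing while community is reliable.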
Household travel - data had household identifiers, could possibly add geo info.
Need to be able to say: run this program, which carries this key, so that it can make a query behind the hospital firewall.
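
One possible reading of this, sketched under heavy assumptions: the program signs its query with a shared key, and a gateway inside the firewall verifies the signature before running anything. The function names, the query fields and the shared-key (HMAC) scheme are illustrative choices, not something agreed with the hospital:

```python
import hashlib
import hmac
import json


def sign_query(key: bytes, query: dict) -> dict:
    """Serialise the query deterministically and attach an HMAC-SHA256 signature."""
    payload = json.dumps(query, sort_keys=True)
    signature = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "signature": signature}


def verify_and_run(key: bytes, request: dict, run_query):
    """Inside the firewall: run the query only if the signature checks out."""
    expected = hmac.new(key, request["payload"].encode("utf-8"),
                        hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, request["signature"]):
        raise PermissionError("signature does not match")
    return run_query(json.loads(request["payload"]))


# Demo key and stand-in query runner; both are placeholders.
key = b"demo-key"
request = sign_query(key, {"dataset": "hospitalizations", "group_by": "month"})
result = verify_and_run(
    key, request, lambda q: f"would run {q['dataset']} by {q['group_by']}"
)
```

In practice the gateway would presumably also restrict which queries are allowed (e.g. aggregates only), which this sketch does not attempt.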