Notes from the resilient cities pilot discussion

Present: Philipp, SteveM, SteveR, ericP, Bill, PhilA, PeterW

What is the goal of the data integration? What are the questions? A model exists already.

What is the impact of change x to the transportation system on infant mortality/pulmonary disease?

Resilience: how will the city respond to a bridge collapse, power station failure, or bus driver strike?

Model focus is planning: scenario exploration (predicting effects of future decisions).

Problem of correlation not necessarily indicating causality. (trying to establish effects of actual decisions)

Data survey from

The data specification's description of the spatial data is insufficient to understand how it works.

Many of the datasets have lat/long.

Travel inventory has locations by city, community, and neighborhood (but many trips don't have a neighborhood). There is also an identifier for household, but we couldn't tell whether households are located. Data also include whether trips were on foot, by bus, or by car. Spatial resolution is possible by household, possible by barrio (neighborhood), and reliable by community. Need centroids for the town, district and neighborhood designations.

Hospitalization diagnosis data is aggregated by year and needs better resolution; air quality data from sensors is reported by hour.

Need higher temporal resolution on hospitalization data to correlate it with air quality data.
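
Until finer hospitalization data is available, the only level at which the two sources line up is the year. A minimal sketch of rolling hourly sensor readings up to yearly means, assuming readings arrive as (ISO timestamp, value) pairs; the function name and the sample values are hypothetical, not taken from the inventory:

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def yearly_means(hourly_readings):
    """Average hourly (timestamp, value) readings per calendar year."""
    by_year = defaultdict(list)
    for timestamp, value in hourly_readings:
        year = datetime.fromisoformat(timestamp).year
        by_year[year].append(value)
    return {year: mean(values) for year, values in by_year.items()}


# Hypothetical readings for illustration only, not real measurements.
readings = [
    ("2015-01-01T00:00:00", 20.0),
    ("2015-01-01T01:00:00", 30.0),
    ("2016-01-01T00:00:00", 40.0),
]
sensor_by_year = yearly_means(readings)
```

Averaging this way discards exactly the hourly variation that would be needed to study short-term effects, which is why higher resolution on the hospitalization side matters.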

In general data in the inventory appears well curated.

Afternoon session

Data we've looked at generally contains a geography.

Situation is as simple as it gets: single city/administration so there is already some harmonisation.

The data is basically fine - could do with some enhancement, connecting to metadata, identifying code lists etc.

Some linkages were apparent. We should think about generalising to the other pilots.

We should offer some steps that would make things easier. It took a while to notice that the hospital data was yearly, a very different granularity from the hourly sensor data. Discovery metadata and variable metadata should state the temporal granularity.
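
One way such metadata could surface the mismatch early is sketched below. The record layout, the field names, and the granularity vocabulary are all assumptions made for illustration, not an existing metadata standard:

```python
from dataclasses import dataclass

# Hypothetical vocabulary, ordered from coarsest to finest.
GRANULARITY_ORDER = ["year", "month", "day", "hour"]


@dataclass
class DatasetMetadata:
    name: str
    temporal_granularity: str  # one of GRANULARITY_ORDER


def common_granularity(a, b):
    """Return the coarser of two granularities: the finest level at which both datasets can be compared."""
    return min(
        a.temporal_granularity,
        b.temporal_granularity,
        key=GRANULARITY_ORDER.index,
    )


hospital = DatasetMetadata("hospitalization diagnoses", "year")
air_quality = DatasetMetadata("air quality sensors", "hour")
shared = common_granularity(hospital, air_quality)
```

With the granularity declared up front, a check like this reports "year" for the hospital and sensor data before anyone opens the files.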

We need datatypes for geospatial data, and metadata stating whether a dataset has geospatial data at all. Some data was located by name, some by lat/long, etc. Need a gazetteer.
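
A rough sketch of what a gazetteer would do here: resolve records located by name to a centroid so they can sit alongside records that already carry lat/long. The place names, coordinates, and record keys below are placeholders, not real entries:

```python
# Hypothetical gazetteer: normalised place name -> (lat, long) centroid.
# The coordinates are placeholder values, not real locations.
GAZETTEER = {
    "community a": (10.0, 20.0),
    "neighborhood b": (11.0, 21.0),
}


def to_point(record):
    """Return (lat, long) for a record located by coordinates or by place name.

    Returns None when the record has neither coordinates nor a known name.
    """
    if "lat" in record and "long" in record:
        return (record["lat"], record["long"])
    name = record.get("place", "").strip().lower()
    return GAZETTEER.get(name)


by_coords = to_point({"lat": 1.0, "long": 2.0})
by_name = to_point({"place": " Community A "})
unresolved = to_point({"place": "unknown"})
```

Records that come back unresolved indicate where the gazetteer is missing entries for a town, district or neighborhood.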

Household travel - data had household identifiers, could possibly add geo info.

Need to be able to say: run this program, which carries this key, so that it can make a query behind the hospital firewall.

Is this scalable?

{SteveM is making notes separately}

Note: W3C Spatial Data on the Web Best Practices (a specialisation of Data on the Web Best Practices) has important recommendations that can inform the spatial data reporting needed for integrating spatial data.

Looking forward:

Get together with the people of Medellín to assess the cost of what they have done in the data aggregation/collection.

What would it cost to achieve more granular data for a higher resolution model?