General ideas for "what next"?
Initial work: what has happened so far?
What did Medellin (and IDDO) do to bring together the data resources that they have?
Project review, analysis of activity in Medellin and IDDO?
Next steps:
What's the delta between what they did and what we would like them to do?
How can we minimise the effort required?
What tools would help? (e.g. resolution service? validation tools?)
Is this a one-off? Could this be repeatable?
Is this coming from an existing infrastructure?
Could you build a service that could enable this "effort minimisation"?
Building data and metadata?
How do we address the metadata gap?
File inspection
- Look at the data - profiling
- Can you identify what the column is likely to be?
- How can you enable this to be confirmed (human checking)?
- How do you feed that back into the ML algorithm?
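As a rough illustration of the file-inspection loop above (profile, guess, human check, feed back), here is a minimal sketch. It assumes CSV input and uses simple regular-expression rules in place of any ML model; the pattern set, the function names and the sample data are all hypothetical and only meant to show the shape of the workflow.

```python
import csv
import io
import re
from collections import Counter

# Hypothetical patterns; a real project would tune these to its own data.
PATTERNS = {
    "integer": re.compile(r"^-?\d+$"),
    "decimal": re.compile(r"^-?\d+\.\d+$"),
    "iso_date": re.compile(r"^\d{4}-\d{2}-\d{2}$"),
}

def guess_value_type(value):
    """Return the name of the first pattern that matches, else 'text'."""
    for name, pattern in PATTERNS.items():
        if pattern.match(value):
            return name
    return "text"

def profile_columns(csv_text):
    """Return {column: (guessed_type, share_of_non_empty_values_matching)}."""
    counts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        for column, value in row.items():
            if not isinstance(value, str):
                continue  # ragged rows: missing or surplus fields
            value = value.strip()
            if value:  # empty cells do not vote
                counts.setdefault(column, Counter())[guess_value_type(value)] += 1
    profile = {}
    for column, counter in counts.items():
        guessed, hits = counter.most_common(1)[0]
        profile[column] = (guessed, hits / sum(counter.values()))
    return profile

def apply_human_review(profile, corrections):
    """Overlay human-confirmed labels on the machine guesses.

    The 'corrected' flag marks the rows that could be fed back
    as training examples for a learned classifier.
    """
    reviewed = {}
    for column, (guessed, _share) in profile.items():
        label = corrections.get(column, guessed)
        reviewed[column] = {
            "label": label,
            "machine_guess": guessed,
            "corrected": label != guessed,
        }
    return reviewed

# Made-up sample data for illustration only.
SAMPLE = """site_id,visit_date,temperature,notes
1,2019-03-01,36.6,ok
2,2019-03-02,37.1,fever
3,2019-03-05,38.0,
"""

profile = profile_columns(SAMPLE)
# A reviewer decides that site_id is an identifier, not a quantity.
reviewed = apply_human_review(profile, {"site_id": "identifier"})
```

The corrected entries in `reviewed` are the part that answers the feedback question: they form a small labelled set that grows each time a person confirms or overrides a guess.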
What then is the story for how you demonstrate value? What is the cost/benefit of this?
How would you originate the data in the format you need? Or could you at least enable this transformation to be automated?
(e.g. OpenRefine for processing content - enabling the process to be repeated)
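One way to picture "enabling the process to be repeated" is to store the clean-up steps as data and replay them on each new batch. The sketch below assumes rows are held as dictionaries; the operation names and recipe layout are hypothetical and are not OpenRefine's own operation-history format.

```python
import json

# Hypothetical operations; each takes a list of row dicts and returns a new list.
def op_rename(rows, old, new):
    return [{(new if key == old else key): value for key, value in row.items()}
            for row in rows]

def op_strip(rows, column):
    return [{**row, column: row[column].strip()} for row in rows]

def op_upper(rows, column):
    return [{**row, column: row[column].upper()} for row in rows]

OPERATIONS = {"rename": op_rename, "strip": op_strip, "upper": op_upper}

def apply_recipe(rows, recipe_json):
    """Replay a stored list of steps, in order, on a batch of rows."""
    for step in json.loads(recipe_json):
        params = {key: value for key, value in step.items() if key != "op"}
        rows = OPERATIONS[step["op"]](rows, **params)
    return rows

# The recipe is plain JSON, so it can be saved, versioned and shared.
RECIPE = json.dumps([
    {"op": "rename", "old": "Cntry", "new": "country"},
    {"op": "strip", "column": "country"},
    {"op": "upper", "column": "country"},
])

# Made-up batches for illustration only.
batch_1 = [{"Cntry": " co "}]
batch_2 = [{"Cntry": "gh"}]

print(apply_recipe(batch_1, RECIPE))
print(apply_recipe(batch_2, RECIPE))
```

Because the recipe is separate from the data, the same steps can be rerun when a source is refreshed, which is the effort saving the question is after.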
Social media data - how to enable extraction and usability of content for city planning purposes?
Interacting with other data sources?
Sustainability
Cost-benefit analysis: Why should they do this? What is the result of not doing this? (Return on investment)
What would be required to maintain this?
What is the institutional infrastructure that is required?
What is the long-term demand across projects or institutions?
What does it take to sustain this? (e.g. What happened to Accra? What has happened so far in Medellin?)
Could you replicate this somewhere else? (E.g. Peter, could you find the relevant data sources in Stirling?) Could you attract other sectors?
How would you fund this to enable sustainability?
What are the policy goals?
Who are the potential funders? (research agencies? governments? ...)
What is the business case you would make to these funders?