Identification Section

The Global Health Observatory (GHO) is "WHO's gateway to health-related statistics for more than 1000 indicators ... organized to monitor progress towards the Sustainable Development Goals (SDGs)".  The GHO supports a queryable database with a URI-based data query API (http://apps.who.int/gho/data/node.resources.api) and metadata descriptions (http://apps.who.int/gho/athena/ and http://apps.who.int/gho/data/node.metadata).  Additional metadata and other information can be found at http://converters.eionet.europa.eu/schemas , although the link to what appears to be an XML Schema (http://converters.eionet.europa.eu/schemas/620) was broken at the time of writing.  GHO information appears to lag behind events.  As an example, there is a fairly comprehensive set of information about the 2016 Ebola outbreak (http://apps.who.int/gho/data/node.ebola-siterep), but there were no references to the 2018 outbreak [[cite]] at the time this document was being developed.
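A minimal sketch, in Python, of what a URI-based GHO query might look like.  The base address is the Athena link cited above; the `api/GHO/<indicator>` path layout, the `format` and `filter` parameters, and the indicator code `WHOSIS_000001` are assumptions made for illustration and should be checked against the API documentation before use.

```python
from urllib.parse import urlencode

# Base address taken from the metadata link cited in the text.
ATHENA_BASE = "http://apps.who.int/gho/athena/"


def build_gho_query(indicator, country=None, fmt="json"):
    """Build a GHO query URL for one indicator.

    Assumed (not confirmed by this document): the path layout
    'api/GHO/<indicator>' and the 'format' / 'filter' parameters.
    """
    params = {"format": fmt}
    if country is not None:
        # Assumed filter syntax: DIMENSION:CODE, e.g. COUNTRY:COD
        params["filter"] = f"COUNTRY:{country}"
    return f"{ATHENA_BASE}api/GHO/{indicator}?{urlencode(params)}"


# Hypothetical indicator code, used only to show the shape of the URL.
url = build_gho_query("WHOSIS_000001", country="COD")
print(url)
```

The function only assembles the URL; fetching and decoding the response is left out because the response layout is not described here.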

The WHO Emergencies preparedness, response site provides a variety of timely information about diseases, alert and response operations, and disease outbreak news [[http://www.who.int/csr]].  The Disease Outbreak News (DON) is published as web pages, but it is clear that some of the content of these pages is derived from structured data sources.  Example: [[http://www.who.int/csr/don/27-september-2018-ebola-drc/en/]].  The WHO DON content is also available as an RSS feed.
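Since the DON content is available as RSS, the items can in principle be read with a standard XML parser.  The sketch below assumes the feed follows the usual RSS 2.0 layout (`rss/channel/item` with `title`, `link` and `pubDate`); the feed address is not given here, so the code parses a hand-written sample rather than real WHO output.

```python
import xml.etree.ElementTree as ET


def parse_rss_items(rss_text):
    """Return a list of {title, link, pubDate} dicts from RSS 2.0 text."""
    root = ET.fromstring(rss_text)
    items = []
    # RSS 2.0 nests entries as <rss><channel><item>...</item></channel></rss>.
    for item in root.findall("./channel/item"):
        items.append({
            "title": (item.findtext("title") or "").strip(),
            "link": (item.findtext("link") or "").strip(),
            "pubDate": (item.findtext("pubDate") or "").strip(),
        })
    return items


# Hand-written sample standing in for the real feed (not actual WHO output).
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Sample outbreak feed</title>
    <item>
      <title>Sample item: Ebola virus disease, DRC</title>
      <link>http://www.who.int/csr/don/27-september-2018-ebola-drc/en/</link>
      <pubDate>Thu, 27 Sep 2018 00:00:00 GMT</pubDate>
    </item>
  </channel>
</rss>"""

for entry in parse_rss_items(SAMPLE):
    print(entry["pubDate"], entry["title"], entry["link"])
```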

ProMED, the Program for Monitoring Emerging Diseases, is an "Internet-based reporting system dedicated to rapid global dissemination of information on outbreaks of infectious diseases" https://www.promedmail.org/aboutus/.  ProMED mail aggregates a variety of sources, sending a summary and associated links.  As an example, an Ebola post (TODO: figure out how to reference the post "Ebola update (103): Congo DR (NK,IT) cases, risk, response, research") aggregates a collection of information from a variety of sources.  The information is translated into the language of the intended reader and includes links to the original sources.  The format and structure of the mailings themselves require human readers to extract and format the relevant information.  There is, however, clearly an automated system underlying these mailings and, were it possible to query this system directly, it might be possible to identify and extract references to sources (e.g. the WHO GHO and CSR systems described above) in an automated fashion.
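Even without direct access to the underlying system, links back to known sources can be pulled out of the text of a mailing.  The following is a rough sketch only: the host and path prefixes are taken from the WHO URLs cited in this section, the matching rule is an illustrative choice rather than anything ProMED provides, and the sample text is invented.

```python
import re
from urllib.parse import urlparse

# Sources named in this section, keyed by a short label. Matching on
# host plus path prefix is an illustrative choice, not a ProMED feature.
KNOWN_SOURCES = {
    "WHO GHO": ("apps.who.int", "/gho"),
    "WHO CSR": ("www.who.int", "/csr"),
}

# A URL runs until whitespace or a bracketing character.
URL_PATTERN = re.compile(r"https?://[^\s<>()\[\]]+")


def extract_source_links(text):
    """Map each known source label to the URLs in `text` that point at it."""
    found = {label: [] for label in KNOWN_SOURCES}
    for raw in URL_PATTERN.findall(text):
        url = raw.rstrip(".,;")  # drop trailing sentence punctuation
        parsed = urlparse(url)
        for label, (host, prefix) in KNOWN_SOURCES.items():
            if parsed.netloc == host and parsed.path.startswith(prefix):
                found[label].append(url)
    return found


# Invented text standing in for the body of a mailing.
sample = (
    "WHO reports new cases, see "
    "http://www.who.int/csr/don/27-september-2018-ebola-drc/en/. "
    "Background data: http://apps.who.int/gho/data/node.ebola-siterep"
)
print(extract_source_links(sample))
```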

The Global Biodiversity Information Facility (GBIF) [[https://www.gbif.org/]] is a potential source of species and vector information.  GBIF data is available through a RESTful JSON API [[https://www.gbif.org/developer/summary]].  The majority of the GBIF datasets are published using the Darwin Core Archive Format (DwC-A), with the metadata represented in the Ecological Metadata Language (EML) standard [[https://knb.ecoinformatics.org/external//emlparser/docs/index.html]].  As an example, a search for Hypsignathus monstrosus, a species of fruit bat suspected of being an Ebola carrier [[https://www.gbif.org/occurrence/search?taxon_key=2432958]], yields a 2018 sighting in Gambia.
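The same occurrence search can be expressed against the JSON API.  The sketch below is an assumption-laden illustration: the base address `https://api.gbif.org/v1`, the `taxonKey`, `limit` and `offset` parameters, and the `results`, `country` and `year` response fields are not stated in this document and should be confirmed against the GBIF developer pages.  The records in the example are invented.

```python
from collections import Counter
from urllib.parse import urlencode

# Assumed API base address; see the GBIF developer summary cited above.
GBIF_API = "https://api.gbif.org/v1"


def occurrence_search_url(taxon_key, limit=20, offset=0):
    """Build an occurrence search URL for one taxon key."""
    query = urlencode({"taxonKey": taxon_key, "limit": limit, "offset": offset})
    return f"{GBIF_API}/occurrence/search?{query}"


def count_by_country_year(response):
    """Tally (country, year) pairs from a parsed search response."""
    tally = Counter()
    for record in response.get("results", []):
        # Either field may be missing on a given record.
        tally[(record.get("country"), record.get("year"))] += 1
    return tally


# Taxon key taken from the search link cited in the text.
print(occurrence_search_url(2432958))

# Invented records shaped like a search response, for illustration only.
fake_response = {"results": [
    {"country": "Gambia", "year": 2018},
    {"country": "Gabon", "year": 2005},
    {"country": "Gabon", "year": 2005},
]}
print(count_by_country_year(fake_response))
```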

Wikidata [[https://www.wikidata.org]] "acts as a central storage for the structured data of Wikipedia, Wikivoyage, Wikisource and others".  As much of the primary development of Wikidata has been focused on the healthcare, bioinformatics and chemical communities, Wikidata can serve as a source of both primary and mapping information.  As an example, the entry on Ebola virus [[https://www.wikidata.org/wiki/Q10538943]] provides translations into 52 target languages and a mapping to equivalent identifiers in a variety of sources, including the International Committee on Taxonomy of Viruses (ICTV) id, the FDA Unique Ingredient Identifier (UNII), etc.
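The translations and identifier mappings can be read from the JSON form of an entity.  The sketch below assumes the `Special:EntityData/<id>.json` address and the `entities` / `labels` / `claims` / `mainsnak` / `datavalue` layout of Wikidata entity documents; the property ID `P0000` is a placeholder (the real IDs for ICTV, UNII, etc. would need to be looked up), and the sample document is hand-written rather than copied from Wikidata.

```python
def entity_data_url(qid):
    """Address of the JSON form of one entity (assumed URL pattern)."""
    return f"https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"


def labels_by_language(entity_doc, qid):
    """Return {language code: label} for one entity."""
    entity = entity_doc["entities"][qid]
    return {lang: lab["value"] for lang, lab in entity.get("labels", {}).items()}


def external_ids(entity_doc, qid, property_ids):
    """Return {property ID: [values]} for the requested properties."""
    claims = entity_doc["entities"][qid].get("claims", {})
    out = {}
    for pid in property_ids:
        values = []
        for statement in claims.get(pid, []):
            datavalue = statement.get("mainsnak", {}).get("datavalue")
            if datavalue is not None:  # some statements carry no value
                values.append(datavalue["value"])
        if values:
            out[pid] = values
    return out


# Hand-written document in the assumed layout; P0000 is a placeholder ID.
sample_doc = {"entities": {"Q10538943": {
    "labels": {
        "en": {"language": "en", "value": "Ebola virus"},
        "fr": {"language": "fr", "value": "virus Ebola"},
    },
    "claims": {"P0000": [
        {"mainsnak": {"datavalue": {"value": "EXAMPLE-ID", "type": "string"}}},
    ]},
}}}

print(entity_data_url("Q10538943"))
print(labels_by_language(sample_doc, "Q10538943"))
print(external_ids(sample_doc, "Q10538943", ["P0000"]))
```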

The Clinical Data Interchange Standards Consortium (CDISC) is a standards development organization (SDO) focused on "standards and innovations to streamline medical research and ensure a link with healthcare" [[wikipedia]].  It is important to understand that what is (and is not) important to the medical research domain differs from what matters in primary clinical practice.  The CDISC Biomedical Research Integrated Domain Group (BRIDG) Model "represents the realm of protocol-driven clinical, pre-clinical, translational and basic research" [[https://www.cdisc.org/standards/domain-information-module/bridg]].  The model includes representations for studies, subjects, agents, products, activities, observations, adverse events and results.  Unless specified by a study, the BRIDG model does not include many of the concepts that are central to clinical care, such as chief complaint, family relationships, current medications, allergies, etc.  In the context of the IDDO, one must realize that it may be quite difficult to integrate the data of different CDISC-based studies, as each study defines its own criteria, observations, purpose, etc.  CDISC provides a common language for the design and exchange of study data, but the designers of different studies would have to work together if the information from multiple studies were to be combined or aggregated.

HL7 and HL7 FHIR.  Health Level Seven (HL7) began as a US-centric standards organization, but its standards have since been adopted and used globally.  HL7 defines several broad but related families of standards, including the HL7 Version 2 (V2) standard, which specifies how clinical and healthcare information can be exchanged using a message idiom; the Clinical Document Architecture (CDA), which specifies how information is represented and exchanged using a structured document paradigm; and FHIR (Fast Healthcare Interoperability Resources), which standardizes the representation of healthcare information using the Resource Oriented Architecture (ROA) style [[Principled Design of the Modern Web Architecture (Fielding and Taylor, May, 2002)]][[Restful Web Services.  Richardson and Ruby. 2007]].  The primary focus of these three standards is (to date) primary clinical care.  The HL7 FHIR standard, in particular, is emerging as a de facto model for the representation and exchange of clinical data.
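To make the resource-oriented style concrete, the sketch below shows how a FHIR resource is addressed by type and id and represented as a JSON document.  The server base address is hypothetical and the patient content is invented; only the `[base]/[type]/[id]` read pattern, the `[base]/[type]?parameter=value` search pattern and the basic shape of a `Patient` resource are taken from the FHIR specification.

```python
from urllib.parse import urlencode

# Hypothetical server base address, for illustration only.
FHIR_BASE = "https://fhir.example.org/base"


def read_url(base, resource_type, resource_id):
    """Address of one resource instance: [base]/[type]/[id]."""
    return f"{base.rstrip('/')}/{resource_type}/{resource_id}"


def search_url(base, resource_type, **params):
    """Address of a search over one resource type: [base]/[type]?..."""
    return f"{base.rstrip('/')}/{resource_type}?{urlencode(params)}"


def display_name(patient):
    """Join the given and family parts of a patient's first name entry."""
    name = patient["name"][0]
    return " ".join(name.get("given", []) + [name.get("family", "")]).strip()


# Invented Patient resource in FHIR's JSON representation.
patient = {
    "resourceType": "Patient",
    "id": "example",
    "name": [{"family": "Doe", "given": ["Jane"]}],
    "gender": "female",
}

print(read_url(FHIR_BASE, patient["resourceType"], patient["id"]))
print(search_url(FHIR_BASE, "Patient", family="Doe"))
print(display_name(patient))
```

Each resource type has its own address space and its own representation, which is the property that distinguishes this style from the message and document paradigms of V2 and CDA.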


Mention 13606[[http://www.en13606.org/]] or openEHR[[https://openehr.org/]]?


GeoNames - or another gazetteer i.e. name→location lookup

OpenStreetMap - geography data / Global Roads Data Set http://ciesin.columbia.edu/data/set/groads-global-roads-open-access-v1

The Gridded Population of the World dataset is published by the NASA Socioeconomic Data and Applications Center (SEDAC) and hosted by the Center for International Earth Science Information Network (CIESIN), a center within the Earth Institute at Columbia University.  CIESIN serves as a jump-off point for a wealth of socioeconomic, geographic and environmental data.  CIESIN datasets are well documented and are augmented with computable metadata.

https://healthsites.io - "Healthsites is working to establish a global commons of health facility data with OpenStreetMap. The International Hospital Federation have specified our current attribute list. We are currently re-visiting this list and looking at how it integrates with OpenStreetMap. In addition we are interested in understanding the OSM tagging structures that support Health outcomes for Women and Girls and people with Disabilities." [[https://github.com/healthsites/healthsites/wiki/Healthsites-data-model]]. 

HealthMap - (about)

MRIIDS - Mapping the Risk of International Infectious Disease Spread (about) (appears to deal only with the 2014-2016 Ebola outbreak)