Snapshot: Concerns relating to the governance of FAIR vocabularies

A strawman list of ‘concerns’ relating to the maintenance and governance of FAIR vocabularies was provided by the international CODATA team responsible for Ten Simple Rules for making a vocabulary FAIR. This was tested and revised by participants in the Australian workshop. The table below is a snapshot of the consensus at the end of the workshop.

1

Scope and context

 

1a

What is the vocabulary, its mission/scope

What are its application/s or use cases?

 

1b

Vocabulary status
(stable, in development, experimental, planned…)

 

1c

Identify 

  • dependencies on other vocabularies (e.g. 

  • vocabulary ecosystem that this vocabulary is a part of

 

2

Stakeholders

 

2a

Identify key stakeholders in the vocabulary and their role. Potential roles (definitions by Simon)

  • Users - the primary community that needs this vocabulary

  • Owner - Entity with ongoing legal responsibility for the content 

  • Steward - content manager or gatekeeper responsible for managing the (consensus?) process for creating and updating the vocabulary, including liaison with the owner and user community

  • Contributor - person or organization authorized to propose changes to the vocabulary contents

  • Reviewer - domain expert who contributes to the change evaluation process

  • Custodian - technical manager - responsible for converting content to machine-readable form, for loading to source-of-truth and propagating it to the FAIR publication system

N.B.: A stakeholder may play more than one role 

 

‘Users’ will typically be defined by some intersection of (a) domain or discipline (b) jurisdiction or location (c) economic or organizational sector (d) organization, sub-organizational unit, laboratory, etc. (e) project or activity

N.B.1: A controlled vocab of roles is needed. Possible inputs:

Contributor Roles Crosswalk compares CRediT, OBO, DDI, DataCite and ISO roles. MARC relators include a more comprehensive set of roles

N.B.2: Roles may be delegated, including for subsets or parts of a vocabulary. However, the delegator is still ultimately responsible

N.B.3: Roles and stakeholders will evolve if scope evolves

2b

Identify other stakeholders

  • Vocabulary service administrator - responsible for technical maintenance of FAIR  hosting system

  • Vocabulary service host - owner of hosting system 

  • Sponsor or authorizing agent

  • Funder

  • Regulator 

  • External content endorser

  • Maintainer of a dependent vocabulary - this is a special case of user (and may not be known about)

Ben Wu - Consideration here as to the type of user and the style of their approach, e.g. in data management - security, classification and content vocabularies

Anu D - N.B. It may be worthwhile to consider profiling those stakeholders - their level of technical and content capability is a good example.

Len Smith - Question (flowing from the HASS BoF at eResearch) - who is, or who should be, responsible for development of standards and vocabularies? This reflects on the likely contributions and authority structures within a given stakeholder community

@Len - that must be a decision for the user community. (If the key user is government (because the codes relate to regulation or expenditure) then they may assume authority.)

3

Content Management

 

3a

Identify the source-of-truth or reference copy for the vocabulary content
e.g. enterprise database, private spreadsheet, shared spreadsheet, web-page, book, PDF, file in VCS

For the vocabulary managers (steward, custodian) this is the artefact which serves as the source for all other artefacts and representations

Note: Vocabulary users will usually see another specific artefact or access point as the ‘point-of-truth’ i.e. the thing which they must conform to. For example, this could be a web-page, web-service, PDF, book …

3c

Establish a mechanism for recording details of any changes and revisions

  • Inside the dataset, or external?

  • What granularity? - vocabulary-as-a-whole, term, axiom

L Wyborn - We should differentiate between changes in the content, and changes in the technical way the vocabulary is delivered

Kheeran D - In 5., it seems very human centric. I think there are two more (related) concerns that should be addressed here:

o Do the changes to this vocab need to be machine readable and machine comprehensible?

o If so, how is the change communicated in a way that is machine comprehensible?

Ben Wu - Changes in the location or status of content (e.g. deprecation, superseding) and changes to the content itself (e.g. descriptions) - how are users alerted?
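
As one illustration of the machine-readability question above, the following Python sketch records changes at a chosen granularity and serialises them as one JSON object per line. The field names, the change types and the example.org URIs are all hypothetical, chosen for illustration only; they are not drawn from any standard or from the workshop.

```python
import json
from dataclasses import dataclass, asdict
from typing import Optional

GRANULARITIES = {"vocabulary", "term", "axiom"}


@dataclass
class ChangeRecord:
    """One entry in a change log. All field names here are illustrative."""
    target: str                        # URI of the vocabulary, term or axiom changed
    granularity: str                   # one of GRANULARITIES
    change_type: str                   # e.g. "modified", "deprecated", "superseded"
    changed_on: str                    # ISO 8601 date
    note: str                          # human-readable explanation
    replaced_by: Optional[str] = None  # URI of the successor, if any

    def __post_init__(self):
        if self.granularity not in GRANULARITIES:
            raise ValueError(f"unknown granularity: {self.granularity}")


def to_json_lines(records):
    """Serialise records as one JSON object per line for machine consumers."""
    return "\n".join(json.dumps(asdict(r), sort_keys=True) for r in records)


# Hypothetical log entries
log = [
    ChangeRecord(
        target="https://example.org/id/colour/red",
        granularity="term",
        change_type="modified",
        changed_on="2021-03-01",
        note="Description reworded; meaning unchanged.",
    ),
    ChangeRecord(
        target="https://example.org/id/colour/scarlet",
        granularity="term",
        change_type="superseded",
        changed_on="2021-03-01",
        note="Merged into a broader term.",
        replaced_by="https://example.org/id/colour/red",
    ),
]

print(to_json_lines(log))
```

A log of this kind could live inside the dataset or alongside it; the sketch takes no position on that question.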

4

Revisions and change-requests

 

4a

Determine the revision schedule (e.g., on-demand, yearly, quarterly..) or a trigger mechanism

 

4b

What is the mechanism to request changes 

  • Who is authorized to make requests?

  • How are change requests made (e.g. GitHub issues, email to group/person, web form, via a service desk, etc.)?

  • Are requests public or private?

Kheeran D - As this template is a guide of things that should be considered in vocab governance, I suggest we change the language here to be more enquiring rather than specifying.

4c

Define the process for evaluation of change requests

  • How is consensus reached or decisions made?

  • How is the evaluation and outcome recorded?

  • Is a reference panel or external peer review involved?

 

5

Implement and Communicate Changes

 

5a

Implementing changes

  • Add recommendations on how to denote and describe versions and releases (including time-stamping)

  • Add recommendations on how to describe release dependencies

Michael Lawley - Denote and describe versions and releases - A versioning scheme and any rules about compatibility between versions (e.g. SemVer). This probably overlaps with 5 as well.
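
A minimal Python sketch of how a SemVer-style scheme might be applied to a vocabulary release follows. The mapping of change types to major, minor and patch levels is an assumption made for illustration (the workshop did not define one), and the change-type names are hypothetical.

```python
# Assumed mapping of vocabulary changes to SemVer levels (illustrative only).
BREAKING = {"term_removed", "meaning_changed"}          # existing uses may break
ADDITIVE = {"term_added", "term_deprecated"}            # existing uses still valid
EDITORIAL = {"label_typo_fixed", "description_edited"}  # no change in meaning


def next_version(current: str, change_types: set) -> str:
    """Return the next MAJOR.MINOR.PATCH version for a set of changes."""
    unknown = change_types - (BREAKING | ADDITIVE | EDITORIAL)
    if not change_types or unknown:
        raise ValueError(f"empty or unknown change types: {sorted(unknown)}")
    major, minor, patch = (int(part) for part in current.split("."))
    if change_types & BREAKING:
        return f"{major + 1}.0.0"
    if change_types & ADDITIVE:
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"


print(next_version("1.4.2", {"term_added", "label_typo_fixed"}))
```

Whether deprecating a term counts as a compatible change is itself a governance decision; the sketch simply assumes that it does.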



Michael Lawley - release dependencies - Is this about dependencies on, for example, other vocabularies that might be referenced in properties or maps?

KheeranD: Following on from Michael’s comment about dependencies, there needs to be something here to tie it back to what has been identified in 1c.

These are instructions for the user of this template, rather than things for consideration by the template user.  This differs from the style of the rest of the template. I think they should be reframed as things for consideration by the template user and perhaps as advice as to what they should consider having in place.

5b

Propagating changes

  • Include how to do comparison between versions of the vocabulary 

  • can be used for release notes

  • explain how changes are propagated to copies and caches
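
The comparison between versions mentioned above can be sketched in Python as follows, assuming each version is available as a simple mapping from term identifier to preferred label. That representation, the function names and the sample data are hypothetical; a real vocabulary would usually carry more per-term properties than a label.

```python
def diff_versions(old: dict, new: dict) -> dict:
    """Compare two versions given as {term_id: label} mappings."""
    return {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "changed": sorted(k for k in old.keys() & new.keys() if old[k] != new[k]),
    }


def release_notes(diff: dict) -> str:
    """Render a diff as plain-text release notes, one line per affected term."""
    lines = []
    for kind in ("added", "removed", "changed"):
        for term_id in diff[kind]:
            lines.append(f"{kind}: {term_id}")
    return "\n".join(lines)


# Hypothetical sample data
v1 = {"red": "Red", "scarlet": "Scarlet", "blue": "Blue"}
v2 = {"red": "Red", "blue": "Blue (primary)", "green": "Green"}

print(release_notes(diff_versions(v1, v2)))
```

The same diff could be handed to copies and caches so that they apply only the affected terms rather than reloading the whole vocabulary.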

Mark Lindsay - what about limited releases prior to a general release - esp. with vocabs with critical dependencies

    KheeranD - Good point Mark.  This also speaks to different levels of maturity.  Many vocabs probably won’t need that kind of distinction.  But some do.  Advice and examples should try to cover these different cases to inform the template user.

5c

Announcing and advertising changes

  • Mailing list

  • Official channels

  • Which parties? Which roles?

  • Who is responsible? 

  • explain how changes are propagated to copies and caches, and to maintainers of dependent systems

 

6

Persistence and sustainability

 

6a

What is the sustainability plan (i.e., in terms of resources and processes) for the vocabulary?

What gives this vocabulary the best chance of surviving?

Who cares? 

Links to some things you would think about in a sustainability plan for vocabs of different maturity/for different purposes….

6b

Is the content archived?

Archived versions (examples of archival options) - can the content be recovered after the FAIR version dies? 

Some resources on sustainability plans of some vocab services (e.g. RVA)

Separate URIs from URLs

  • Examples needed
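
One possible example, sketched in Python: persistent URIs identify the vocabulary and its terms, while a separate lookup table records where the content is currently served. All URIs, URLs and names below are hypothetical. The point of the sketch is that when the hosting service changes, only the table is edited and the identifiers that users cite stay the same.

```python
from typing import Optional

# Persistent identifier namespace -> current hosting location (both hypothetical).
locations = {
    "https://example.org/id/colour": "https://host-a.example.net/vocabs/colour/v2",
}


def resolve(uri: str, table: dict) -> Optional[str]:
    """Return the current URL for a persistent URI, or None if it is unknown."""
    # Try the longest registered prefix first so nested namespaces resolve correctly.
    for prefix in sorted(table, key=len, reverse=True):
        if uri == prefix or uri.startswith(prefix + "/"):
            return table[prefix] + uri[len(prefix):]
    return None


term = "https://example.org/id/colour/red"
print(resolve(term, locations))

# The vocabulary moves to a new host: the URI is unchanged, only the table is updated.
locations["https://example.org/id/colour"] = "https://archive.example.com/colour"
print(resolve(term, locations))
```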

M Wong - can we make clearer/example what is meant by ‘is this content archived’? - referring to machine readable vocabulary (clarify from outset) - is this meaning a static version?

MWong - could you separate out ‘is the content archived’ as a separate question, maybe?

KheeranD - As part of governance, it is worth explicitly considering the ‘retirement plan’ for a vocab.  How can a vocab be gracefully retired from service?  This may be due to it being superseded by a different vocab, or it could have outlived its usefulness, or it could be that it is no longer actively maintained, or ...

A summary of the analysis using the vocabulary case-studies is attached.

General feedback



Examples to help guide users are being compiled in Excel, to demonstrate the different levels of ‘robustness’ of governance required and in place

Kheeran D - Requires explanation for each question. While the template worked for the workshop, if we are going to put this out for more general consumption, as an output in a form that others can make use of, then each question requires some explanation to help the reader.

MWong - could we map these against the principles of trust they support (such as CoreTrustSeal)?

Kheeran D - Should there be consideration given to licensing terms for the vocab? Again, from the Munsell colours experience, I think this may be in two parts.

o What is the license of the vocab 'source of truth'? i.e. what is the license that the vocab-manager will need to abide by?

o What is the license for the end user of the vocab? i.e. what is the license that the FAIR vocab is provided under for consuming it?



Kheeran D - I wonder if it is useful to structure the template in such a way that enables people to identify the current situation and be able to consider any gaps in it for the purpose that they want to use it?

Two things would help this:

  1. A set of contrasting examples that show different types of governance structures that have been adopted.

  2. Discussion of the kind of concerns.

Kheeran D - Also, the work in Name types, AS4590 and ANZSCO all have valuable annotations that are worth consolidating into the main template for further consideration.

I also think that it may be useful if these concerns are presented against the strength (robustness?) of governance required.  Eg:

  • Simple governance needs (eg: Munsell Colours?)

  • ?

  • ?

  • ?

  • Strong governance needs (eg: SNOMED, ANZSCO)

M Wong - RDA FAIR maturity model status may be useful for each question to indicate where each question is at for a particular point in time - like ‘have not considered’, ‘inapplicable’, ‘in planning phase’ or ‘implementation phase’, etc.

Kheeran D:

Few more points from me after another read/review today:

  • As part of the governance, I think it is useful to give consideration to the criticality of the vocab from a user impact perspective.  For example, the impact of changes to ANZSCO is pretty high as people's lives, or income could depend on it.  In contrast the historical police districts are likely to have much less dramatic impact on the users.  This is useful as it provides context and guidance when considering the governance one wants to put in place for the vocab being considered.

  • I wonder if it is worth considering these concerns from two viewpoints?

    • The viewpoint of the creator and maintainer of the vocab.

    • The viewpoint of the publisher of the vocab.

I think they have slightly different concerns and may provide more clarity when thinking about the governance concerns.

  • From a use of the tool perspective, I think it would be useful to have a 3rd column in the template that allows the template user to describe the desired state or the gap between the current state and the desired state.

  • I think there is a related activity to be done sometime to develop a set of common/ideal patterns that can help inform and guide the thinking as these governance concerns are being considered for a vocab.

  • There needs to be some editorial work done at some point to keep the style/language of the questions in a consistent form across the template.  It's a bit varied at the moment.

 

KD - Consistency across qs re how they are worded/being presented - as instructions, or a series of prompting questions/things to consider