Services to support FAIR data
Draft report and recommendations

 
Provided by OpenAIRE, FAIRsFAIR, RDA Europe, FREYA, EOSC-hub

Introduction

This report highlights common challenges, priorities and a set of initial recommendations aimed at stakeholders in the scholarly world and particularly the European Open Science Cloud (EOSC) Governance. The report is the output of two workshops[1] designed to explore, discuss and propose recommendations on how existing data infrastructures can evolve and collaborate to provide services that support the implementation of the FAIR data principles, in particular in the context of building the EOSC.

The two workshops had slightly different objectives and targeted different communities. Service providers, research infrastructures and users in the research community considered implementation stories and shared their perspectives on, and vision of, what producing FAIR data entails.

This report is presented in a draft version for discussion at a third workshop at the Open Science FAIR, September 2019. The final version of the report is intended for submission to the EOSC Working Group on FAIR.

Case studies

The case studies considered for the workshops covered different types of services supporting research data including data repositories, e-infrastructure and service providers. The services that were presented at the workshops are:

  1. EGI Datahub: Data management and exploitation[2] 
  2. Persistent Identifiers (PIDs) services for FAIR data[3]
  3. CoreTrustSeal: Certification of data repositories[4]
  4. Wikidata: Knowledge base[5] 
  5. Zenodo: Generic data repositories[6] 

Common questions were addressed by these case studies. The participants discussed the role of research institutions and other stakeholders, took stock of existing services and their maturity in enabling FAIR research outputs, and finally identified gaps and defined priorities to align and combine these services to provide a FAIR data ecosystem. Presentations and proceedings of the discussions in Vienna[7] and Prague[8] are available on the web.

Services supporting a FAIR data ecosystem

The case studies offered a good sample representing the minimum components of the FAIR data technical ecosystem identified in the report on Turning FAIR into reality[9]. The presentations and discussions during the workshops did however cover the broader scholarly ecosystem, recognising that FAIR data are part of a complicated landscape. The topics covered can be generally grouped into the following categories:

  1. Governance / Standards
  2. Discovery and publishing / Cataloguing / Persistent identifiers and linking research outputs
  3. Stewardship / Preservation / Metadata
  4. Quality / Trust
  5. Metrics / Monitoring / Maturity
  6. Sustainability and preservation
  7. Skills / Support

Participants identified key needs and areas of improvement across these general topic areas. Within the current landscape, some of the biggest gaps include:

  1. The lack of a sustainable ecosystem of independent, interoperable services with governance, business model(s) and shared responsibilities to support the creation of FAIR research outputs
  2. The need to address equally: 1) the principles related to findability and accessibility, which require mostly technical expertise and can be addressed by generic services (e.g. PIDs, cataloguing, discovery and storage); and 2) the principles related to interoperability and reuse, which require services that cater to disciplinary needs with specific domain expertise (e.g. ontologies, curation and stewardship provided by domain repositories)
  3. The need for skills and services for data stewardship and preservation to keep research outputs FAIR over time. Both technical and conceptual expertise for data services is necessary.

Priorities and Recommendations

As an outcome of the workshops, participants formulated the following recommendations for services to support FAIR data:

  1. Certification:
     1. Certification mechanisms and capability maturity models need to be further developed for services, and embraced by them, to align with the FAIR principles
     2. Data repositories should undergo FAIR-aligned certification such as CoreTrustSeal
  2. Essential infrastructure components:

Services supporting FAIR data should offer or make use of the following components:

     1. PID services for a wide range of objects, such as publications, researchers, datasets and organisations. Emerging PID types (e.g. for instruments) should be monitored and used when they are mature
     2. Domain-specific ontologies, as domain-specific requirements have to be taken into account
     3. Human and machine-readable standards to make datasets findable, reusable and interoperable (licences are one particular example of standards needed for machine readability)
     4. If applicable, metadata that complies with appropriate (domain) standards should be generated and captured automatically (e.g. by instruments)
  3. Stewardship:

To support the effective use and uptake of services enabling FAIR, institutions should:

     1. Establish data stewardship programmes providing simple and intuitive training for researchers, and enable data stewards and researchers who support applications of FAIR
     2. Support preservation and appraisal of research outputs: improve and maintain the FAIRness of data objects over time, and the long-term usability and findability of datasets
  4. Costs:
     1. Determine the cost for services to align with the FAIR principles, including data management support, maintenance and long-term preservation
     2. Develop a sustainable funding model (of services) taking into account that there might be additional costs for FAIR
     3. Provide support when determining the cost of data management, as this is typically underestimated or unknown
  5. Rewards:
     1. Consider FAIR compliance and data sharing as part of research assessment, among other criteria
     2. Funders should recognise and recommend the use of certified Trustworthy Digital Repositories (TDRs) in Data Management Plans (DMPs)
  6. Collaboration and support:
     1. Set up and participate in cross-institutional, collaborative communities of practice to advance and implement FAIR services
     2. Foster global collaboration on FAIR implementation challenges and emerging solutions through organisations such as the Research Data Alliance
     3. Create practical guidelines on how to enable FAIR in repositories
     4. Provide skilled legal advisers in institutions to help in preparing robust DMPs
  7. Data management:
     1. There should be a data selection policy that, pre-deposit, recognises that not all research outputs must meet the highest levels of FAIRness, recognises what has long-term value, and takes effect immediately after generation
     2. Data Management Plans should be required early when applying for funding and must have organisational relevance
     3. Legal aspects should be taken into account from the start of a project
  8. Addendum on EOSC:

Participants were also asked about their expectations from the EOSC Governance and data infrastructures. Among the main points raised were:

  1. A clarification of the cost models behind the use of the (EOSC) services as well as the granularity around how services can and will be procured
  2. The definition of the direction of the EOSC in 5-10 years' time
  3. Infrastructures as custodians of standards
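
As an illustration of the recommendation on essential infrastructure components, the following minimal sketch shows what a dataset description could look like when PIDs and the licence are expressed in machine-readable form. It assumes schema.org JSON-LD as the metadata vocabulary, which is one possible choice and not one prescribed by the workshops. All identifiers in the record are placeholders, and the function `missing_machine_readable_fields` is a hypothetical helper written for this example.

```python
import json

# Hypothetical record: every identifier below is a placeholder, not a real object.
# Assumption: schema.org JSON-LD is used as the metadata vocabulary.
record = {
    "@context": "https://schema.org",
    "@type": "Dataset",
    "@id": "https://doi.org/10.1234/example-dataset",  # placeholder dataset PID
    "name": "Example measurement series",
    "license": "https://creativecommons.org/licenses/by/4.0/",
    "creator": {
        "@type": "Person",
        "@id": "https://orcid.org/0000-0000-0000-0000",  # placeholder researcher PID
        "affiliation": {
            "@type": "Organization",
            "@id": "https://ror.org/example",  # placeholder organisation PID
        },
    },
}


def missing_machine_readable_fields(rec):
    """Return the checked fields that are absent or not given as an HTTP(S) URI."""
    creator = rec.get("creator", {})
    checks = {
        "dataset PID": rec.get("@id"),
        "licence": rec.get("license"),
        "creator PID": creator.get("@id"),
        "organisation PID": creator.get("affiliation", {}).get("@id"),
    }
    return [
        name
        for name, value in checks.items()
        if not (isinstance(value, str) and value.startswith(("http://", "https://")))
    ]


print(json.dumps(record, indent=2))
print(missing_machine_readable_fields(record))
```

A licence given only as free text (for example "CC BY 4.0" instead of a URI) would be reported by the helper, which is the kind of gap a repository could flag at deposit time.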

Conclusion

Based on the outcome of the two workshops, this draft report provides a platform to discuss priorities and recommendations for services to support FAIR data. Existing EOSC plans and projects can help to further develop and implement the recommendations and fill the gaps.



[1] Workshop 1, April 12 2019, Prague (EOSC-hub week https://www.eosc-hub.eu/events/eosc-hub-week-2019/programme/services-support-fair-data ). Workshop 2, April 24 2019, Vienna (Linking Open Science in Austria https://linkingopenscience.univie.ac.at/agenda/).

[2] https://datahub.egi.eu slides from the presentation: www.slideshare.net/eoschub-services

[3] This use case was presented by the FREYA project. See: https://www.project-freya.eu/en slides from the presentation: https://www.slideshare.net/freya-services

[4] https://www.coretrustseal.org/ slides from the presentation: www.slideshare.net/core-trust-sea

[5] https://www.wikidata.org slides from the presentation: www.slideshare.net/wikidata

[6] https://zenodo.org/ slides from the presentation: www.slideshare.net/zenodo

[7] https://www.openaire.eu/openaire-workshop-making-services-fair-vienna-april-24th-2019

[8] https://www.openaire.eu/report-services-to-support-fair-data-from-theory-to-implementation 

[9] Turning FAIR into reality: Final report and action plan from the European Commission expert group on FAIR data, EU 2018. https://doi.org/10.2777/1524