1 of 14

Pandemic Research Infrastructure (PaRI) joins the NeIC affiliate programme

The one-year PaRI project focused on facilitating Nordic research on pandemics, especially the COVID-19 pandemic

1 Nov 2021–31 Oct 2022

Presented by Wolmar Nyberg Åkerström (NBIS), NeIC AHM 2022-01-24

2 of 14

Infrastructure for pandemic data

  • COVID-19 workflows
  • National compute resources
  • Local infrastructure
  • Dashboard: Geographic distribution of virus variants over time
  • Guide to FAIR viral genome data

3 of 14

A guide to data sharing

  • Provide guidance to projects, labs and other organisations producing or commissioning viral sequencing data
  • What constitutes a good and ultimately (re)usable viral genome data record from a Nordic perspective?

4 of 14

Viral genome data sharing

Data deposit

Metadata as XML

Compressed data files

Data analyses

Raw reads

Submission tooling

Sample description
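
A minimal sketch of the "metadata as XML" and "compressed data files" steps on this slide, assuming a flat dictionary of sample attributes. The element names, the alias value and the helper functions are illustrative assumptions, not the schema or tooling of any particular archive:

```python
import gzip
import hashlib
import xml.etree.ElementTree as ET


def build_sample_xml(alias, attributes):
    """Return a sample description as an XML string.

    Element names are illustrative assumptions, not an archive schema.
    """
    sample = ET.Element("SAMPLE", alias=alias)
    attrs = ET.SubElement(sample, "SAMPLE_ATTRIBUTES")
    for tag, value in attributes.items():
        attr = ET.SubElement(attrs, "SAMPLE_ATTRIBUTE")
        ET.SubElement(attr, "TAG").text = tag
        ET.SubElement(attr, "VALUE").text = str(value)
    return ET.tostring(sample, encoding="unicode")


def compress_reads(raw_bytes):
    """Gzip-compress read data; return the compressed bytes and their MD5 checksum."""
    compressed = gzip.compress(raw_bytes)
    return compressed, hashlib.md5(compressed).hexdigest()


# Hypothetical example values
xml_text = build_sample_xml("sample-001", {"collection date": "2021-11"})
payload, checksum = compress_reads(b"@read1\nACGT\n+\nIIII\n")
```

Keeping the checksum next to the compressed file lets a receiving service confirm that the upload arrived intact.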

5 of 14

A playbook for deploying Galaxy

6 of 14

Curated tools & workflows

7 of 14

Data flow for regional surveillance

8 of 14

PaRI Affiliate activities

Expanded professional network

  • FAIRification of pandemic related genomic data
  • Running and administering COVID-19 workflows in Galaxy
  • Curating pandemic data for visualisations in Nextstrain / Auspice (possibly other visualisation tools)

Targeted outreach

  • Onboarding new “partners” to our network
  • Merge deliverables into active projects and infrastructures

Access to tools / workflows

  • Populate a GitHub organisation (pari-neic) with tools and workflows from the partners

9 of 14

Pandemic Research Infrastructure (PaRI) joins the NeIC affiliate programme

Coordinator: Wolmar Nyberg Åkerström (NBIS)

Project owner: Abdulrahman Azab (NeIC)

Partners: NBIS, ETAIS, DTU, UiO/USIT, UiB

Contact: wolmar.n.akerstrom@uu.se

10 of 14

FAIR data sharing

Background from ‘FAIR Principles’ by Martínez-Lavanchy et al. (2019), CC BY 4.0, doi:10.11581/dtu:00000049

Robot: MetaManMachine by Nikola Vasiljevic (2021), CC BY-SA 4.0, doi:10.5281/zenodo.4471098

Works with my software

11 of 14

Supporting users FAIRifying data

Users get in contact with PaRI support staff online through the COVID-19 Data Portal network, which offers:

  • Advice on localizing data
  • Annotating and submitting data
  • Contact information on how to get support

12 of 14

Infrastructure (non-sensitive data)

Analysis and storage services of PaRI for non-sensitive data:

  • A mix of services developed and operated by ELIXIR
  • National generic e-infrastructure services
  • National Infrastructure for Research Data (NIRD)
  • Norwegian Research and Education Cloud (NREC)
  • SNIC Rackham
  • Estonian Scientific Computing Infrastructure (ETAIS)

13 of 14

Infrastructure (sensitive data)

The Federated EGA services depend on more generic services offered by existing e-infrastructure service providers in the Nordic countries: TSD in Norway and SNIC/Sunet in Sweden.

  • Tjenester for Sensitive Data (TSD)
  • SNIC Bianca
  • Sunet Cloud

14 of 14

Samples from Human Hosts

Sample description

  • A positive test result is information about a person’s health and is thus sensitive
  • May contain personal identity numbers and other identifiers for an individual
  • Demographic information, such as age, sex, location and date, could be used to deduce the identity of an individual

Experiment description

  • May contain identifiers or references that could be used to deduce the identity of an individual

Raw reads

  • Genome fragments from the host can be used to identify an individual and should be considered sensitive
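
The sample description points above could be sketched as a step run before a record is shared. This assumes a hypothetical flat metadata record: the field names and the coarsening rules (month-level dates, ten-year age ranges) are illustrative assumptions, not PaRI policy, and the sketch does not address host genome fragments in raw reads:

```python
# Hypothetical field names for direct identifiers (assumption for illustration)
DIRECT_IDENTIFIERS = {"personal_identity_number", "name", "patient_id"}


def coarsen_record(record):
    """Drop direct identifiers and coarsen quasi-identifiers in a sample record."""
    out = {}
    for key, value in record.items():
        if key in DIRECT_IDENTIFIERS:
            continue  # never share direct identifiers
        if key == "collection_date":
            out[key] = str(value)[:7]  # keep YYYY-MM of an ISO 8601 date
        elif key == "age":
            low = (int(value) // 10) * 10
            out["age_range"] = f"{low}-{low + 9}"  # e.g. 34 becomes "30-39"
        else:
            out[key] = value
    return out


# Hypothetical example record
shared = coarsen_record({
    "personal_identity_number": "000000-0000",
    "collection_date": "2021-11-05",
    "age": 34,
    "sex": "F",
})
```

Coarsening reduces but does not remove the risk of re-identification, so the remaining fields still need to be assessed in combination.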