1 of 24

Reproducible bioinformatics using pipelines and cloud: an in-depth view

Nikita Shvetsov

Scientific developer / Senior engineer at UiT

2 of 24

Challenges

  • Data and processing platform

  • Pipeline framework for pre-processing

  • Reproducibility and documentation

3 of 24

Infrastructure

6 of 24

NOWAC (Norwegian Women and Cancer)

Id-lung

Tromsø Lung Study (in collaboration with Medsens.io)

7 of 24

NOWAC lab scheme

8 of 24

Lab interaction scheme (TLS - Medsens.io)

9 of 24

Pipeline platform

10 of 24

Core features

  • git-like

  • Incrementality

  • Kubernetes (k8s) and cloud by design

  • Reproducibility

  • Language-agnostic

  • Data provenance

13 of 24

Omics explanations and pipelines

15 of 24

mRNA preprocessing

16 of 24

miRNA preprocessing

17 of 24

DNAm processing

18 of 24

Microarray data

22 of 24

Data engineer

Researcher

24 of 24

nikita.shvetsov@uit.no

  • Infrastructure in place

  • Supports many omics types

  • Already in use