1 of 29

AIDA Data Hub

Data Science Platform

A home for your research.

2 of 29

AIDA Data Hub

Services for Clinical Innovation in Data Driven Precision Health

National e-infrastructure supporting the Analytic Imaging Diagnostic Arena (AIDA)

Hosted by LiU and the Center for Medical Image Science and Visualization (CMIV)

Part of SciLifeLab Bioinformatics platform (NBIS)

2025-03-19: Welcome to the launch of the AIDA Data Hub DSP

3 of 29

AIDA & AIDA Data Hub

AIDA Community - medtech4health.se/aida

National collaboration arena for AI research and innovation in medical imaging diagnostics.

AIDA Data Hub - datahub.aida.scilifelab.se

E-infrastructure set up to support AIDA.

4 of 29

AIDA Data Hub

E-infrastructure for research and clinical innovation in data driven precision health.

Data services

  • Access high quality datasets
  • FAIR sharing of DOI citable datasets
  • Extract and enrich clinical data for research

Data Science Platform for Sensitive data

  • Secure long term primary storage & compute.
  • Advanced usage patterns: collaborate, annotate, federate, train AI, ...

Support

  • Data sharing, ethics, legal, and policy
  • AI development & System design

5 of 29

6 of 29

AIDA Data Hub

Started in 2018, as part of AIDA.

Support in sharing medical imaging data.

Policy support in sensitive data for research.

Support to ethical review applications.

Agreement support.

Template agreements for data sharing and transfer.

2018

7 of 29

Data Sharing Policy

AIDA Data Sharing Policy

Comprehensive resource describing best practices in handling and sharing medical imaging data for research in Sweden and similar countries.

Concrete guidelines and examples, with references to original sources in law.

Developed multi-professionally with AIDA network and stakeholders.

Key insights have been published in Nature Scientific Data.

2019

8 of 29

Using Clinical Imaging Data for Research

Common practice in Sweden and similar countries, 1-paragraph summary:

The common practice is that caregivers disclose data to research institutions for specific activities described in approved ethical review applications, to be carried out under appropriate technical and organizational protective measures and supervised by a named competent researcher. The research institution is then data controller and copyright holder for the disclosed data, and is responsible for ensuring that data is processed and shared only as described in the approved ethical review application, with data processing agreements, pseudonymization, anonymization and licensing as tools, and with an obligation to store relevant data for 10 years after last use for purposes of research validation.
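As an illustration of pseudonymization as one of the tools named above, here is a minimal Python sketch using keyed hashing (HMAC-SHA256). It is not a description of how AIDA Data Hub or any caregiver actually pseudonymizes data. The function name `pseudonymize`, the key, and the record fields are all hypothetical, and the identifiers are made up.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes, length: int = 16) -> str:
    """Return a stable pseudonym for patient_id.

    The same ID and key always give the same pseudonym, so records for one
    subject can still be linked, but the ID cannot be recomputed without the key.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


# Hypothetical key; in practice it would stay with the party allowed to re-identify.
key = b"example-key-not-for-real-use"

# Made-up records for illustration only.
records = [
    {"patient_id": "patient-001", "modality": "CT"},
    {"patient_id": "patient-001", "modality": "MR"},
    {"patient_id": "patient-002", "modality": "CT"},
]

# Replace the direct identifier with the pseudonym before disclosure.
pseudonymized = [
    {"subject": pseudonymize(r["patient_id"], key), "modality": r["modality"]}
    for r in records
]

for row in pseudonymized:
    print(row)
```

Pseudonymized data of this kind is still personal data, since whoever holds the key can re-identify subjects. Anonymization requires that re-identification is no longer reasonably possible.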

9 of 29

10 of 29

Data In

          Datasets     Scans  Annotations       Size
Total           50    154917        48787    56.95TB
                15      6590        48387     2.15TB
                11     13240        34186    10.86TB
                39    141677        14601    44.47TB
                 2    106448      1006448   124.32GB

11 of 29

Data Out

Metrics:

  • Countries: 44
  • External sharing events: 287

12 of 29

Tryggve

Nordic collaboration on sensitive personal data for research

Joel Hedlund, Executive manager

13 of 29

NeIC and Nordic ELIXIR nodes.

Collaboration between Nordic platforms for research using sensitive personal biomedical data.

Tryggve 1: Processing platform development.

Tryggve 2: Sensitive data archiving - Federated EGA.

14 of 29

Nordic Register Genomics in Psychiatry - A Tryggve2 research study

Lu Yi, PhD. lu.yi@ki.se

Patrick Sullivan, Prof.

Psychiatric Genomics Institute,

Karolinska Institutet

15 of 29

Schizophrenia Basics

  • Delusions & hallucinations, no known cause (minimum duration 6 months)
  • Massive
    • Morbidity: top 10 in world
    • Mortality: life expectancy 10-15 years less
    • Costs (personal/familial/societal): $US 1.4M/life
  • Intractable to extensive scientific study
  • Subtle processes

Website & Contact: neic.no/tryggve/

16 of 29

Schizophrenia Genetics

A major clue, from generations of past work.

Probabilistic not deterministic:

  • Family history, 10x increase (but 1% → 10%)
  • MZ twins, risk to co-twin ~50%
  • Heritability ~ 80%

No convincing single gene causes.

17 of 29

Nordic registers

  • In-/out-patient register
  • Prescription drug register
  • Medical birth register
  • Multi-generation register
  • Social insurance register
  • Cause of Death register

18 of 29

Study Numbers

Descriptor                                         Denmark      Norway      Sweden       TOTAL

Vital statistics: Q4 2017
– total population                               5,781,190   5,295,619  10,120,242  21,197,051
– births                                            61,397      56,633     115,416     233,446
– foreign born (%)                                     8.5        14.1        18.5        13.7

Register analyses
– lifetime Schizophrenia (SCZ)                      36,676       9,002      29,072      74,750
– lifetime Major depression (MD)                    75,771      87,540     595,743     683,283
– lifetime Postpartum depression (PPD)              50,176       8,572      93,960     152,708
– lifetime Eating disorders in females (ED)         21,816       4,857      34,238      60,911

Microarray data: Q2 2018
– Ns with GWAS                                      89,273       2,850     183,966     276,089
– SCZ cases                                          5,247         800       4,924      10,971
– MD cases                                          25,431           0       5,059      30,490
– PPD cases                                          1,600           0       1,381       2,981
– ED cases                                           5,114           0       4,118       9,232

Microarray data: Q4 2021
– Ns with GWAS                                     425,000     386,000     300,000   1,111,000
– SCZ cases                                          9,622       2,240      12,000      23,862
– MD cases                                          45,701      11,750      10,000      67,451
– PPD cases                                          3,600       1,000       2,881       7,481

19 of 29

Tryggve2

A federated system that enables data sharing and analysis in a secure, streamlined & intelligent way

#2-7

Distributed compute solution via singularity container
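The federated idea can be sketched in Python as follows: each site runs the same analysis step locally on its own data and returns only aggregate statistics, which a coordinator then pools. This is a simplified illustration under stated assumptions, not the Tryggve2 implementation. The names `SiteSummary`, `summarize` and `pool`, the site labels, and all numbers are hypothetical. In a containerized setup, a step like `summarize` is what would be packaged and shipped to each site.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SiteSummary:
    """Aggregate statistics that leave a site; no individual-level values."""
    site: str
    n: int
    total: float
    total_sq: float


def summarize(site: str, values: list[float]) -> SiteSummary:
    """Run locally at each site, inside its own secure environment."""
    return SiteSummary(site, len(values), sum(values), sum(v * v for v in values))


def pool(summaries: list[SiteSummary]) -> tuple[float, float]:
    """Combine site summaries into a pooled mean and sample standard deviation."""
    n = sum(s.n for s in summaries)
    total = sum(s.total for s in summaries)
    total_sq = sum(s.total_sq for s in summaries)
    mean = total / n
    variance = (total_sq - n * mean * mean) / (n - 1)
    return mean, sqrt(variance)


# Made-up measurements held separately at three hypothetical sites.
local_data = {
    "site_a": [1.0, 2.0, 3.0],
    "site_b": [4.0, 5.0],
    "site_c": [6.0],
}

# Only the summaries cross site boundaries.
summaries = [summarize(site, values) for site, values in local_data.items()]
mean, sd = pool(summaries)
print(f"pooled mean={mean:.3f}, pooled sd={sd:.3f}")
```

The pooled result equals what a single analysis over all values would give, yet no site discloses individual records. Real deployments would also need to guard against disclosure through very small groups.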

20 of 29

AIDA Data Hub

Sensitive Data Services

First GPU system in Sweden for research using sensitive personal biomedical data.

Set up at CMIV, in collaboration with Nvidia, serving researchers in the AIDA community.

Hosted in Region Östergötland secure data centers, which also host the hospital electronic health record production systems.

Hosts the VINNOVA-funded SCAPIS data lab, where AI researchers can securely process SCAPIS data for research.

AIDA DGX-2 Service

Service for best-in-class researchers in Swedish medical imaging diagnostic AI.

Secure enough for medical personal data.

2020

21 of 29

AIDA Data Hub

Data Science Platform

Secure data science platform co-located with national/European flagship compute systems.

Supporting advanced data usage patterns: long term primary storage, collaborate, annotate, share, federate, train AI...

Customers make security decisions as appropriate, for example allowing outgoing connections to home institution servers or collaborators.

User fees for sustainable operations and development. Discounts to incentivize data sharing and maximize high impact research.

Based on Bigpicture/GDI technologies.

2025

22 of 29

AIDA Data Hub contribution to Arrhenius sensitive data

NAISS Arrhenius: next flagship system for academic computing in Sweden, part of EuroHPC.

AIDA Data Hub has contributed to the sensitive data capability procurement group, with experience from systems for sensitive data and AI innovation in medical imaging diagnostics.

10% of budget to sensitive data support.

23 of 29

AI Factories for Life Science

EUR 2+ billion investment.

The Swedish system, Mimer, is to be operated by NAISS in collaboration with RISE.

To be built in the same computer rooms as the AIDA Data Hub DSP, and to offer similar functionalities.

Recruitment of 52 FTE is ongoing, to be in place in the first half of 2025.

24 of 29

Bigpicture Petabyte platform for European digital pathology AI

AIDA Data Hub leads the repository infrastructure development, which is carried out in collaboration with sensitive data teams at the NBIS Systems Development unit and CSC.fi.

Establishing Bigpicture Federated node on AIDA Data Hub Data Science Platform.

25 of 29

EUCAIM European Federation for Cancer Images

AIDA Data Hub Data Science Platform to support EUCAIM use cases.

Collaboration with sensitive data teams at the NBIS Systems Development unit.

26 of 29

ASHA - Använda Standardiserade Hälsodata som Accelerator (Using Standardized Health Data as an Accelerator)

VINNOVA system demonstrator, led by Region Östergötland (RÖ), for data lake systems for primary and secondary use.

AIDA Data Hub Data Science Platform provides spaces for secondary use.

27 of 29

28 of 29

Thank you!

AIDA Data Hub

Services for Clinical Innovation in Data Driven Precision Health

National data infrastructure supporting the Analytic Imaging Diagnostic Arena (AIDA)

Hosted by LiU and the Center for Medical Image Science and Visualization (CMIV)

Part of SciLifeLab Bioinformatics platform (NBIS)

29 of 29