1 of 31

AIDA Data Hub

Services for Clinical Innovation in Medical Imaging Diagnostic AI.

National data infrastructure supporting the Analytic Imaging Diagnostic Arena (AIDA)

Hosted by LiU and the Center for Medical Image Science and Visualization (CMIV)

Funded by SciLifeLab Bioinformatics platform (NBIS)

240207 Sensitive Data Services 2.0 for AIDA Days at CMIV

2 of 31

Secure AI training systems

Set up at CMIV in collaboration with Nvidia serving researchers in the AIDA community.

Hosting VINNOVA funded SCAPIS data lab, where AI researchers can securely process SCAPIS data for research.

15M extension during 2024: more hardware, usable by a wider range of professions.

LiU procurement unit engaged.

AIDA DGX-2 Service

Service for best-in-class researchers in Swedish medical imaging diagnostic AI.

Secure enough for medical personal data.

3 of 31

AIDA Data Hub SDS 2.0

Next generation Sensitive Data Services.

Based on Bigpicture/GDI technologies.

Procurement underway, with co-funding to support several communities:

  • AIDA (2.5 MSEK)
  • EUCAIM (1 MSEK)
  • ASHA / ÖHDS (8 MSEK)
  • CMIV (3 MSEK)

Discussions with further co-funders are ongoing.

4 of 31

SDS 2.0 Establishment

Now: dialogue with LiU procurement.

Hardware installation and progressive rollout of services in Q2 2024.

Basic services for technical experts.

Progressively more advanced services for a progressively broader audience.

Service delivery roadmap and iterative development priorities will be based on continuous stakeholder dialogue.

5 of 31

SDS 2.0 Customer model

Ethically approved research project: research institute represented by a competent researcher (PI).

Customer segmentation: you cannot see other customers, and they cannot see you.

Customer makes security decisions appropriate for their project.

6 of 31

SDS 2.0 Infrastructure

Secure enough for large amounts of information of extreme confidentiality.

Still physically located in RÖ secure data centers for EHR production systems, in a submarine in a bunker, with guards.

Petabyte-scale Ceph object storage.

OpenStack platform, providing virtual machines for GPU/CPU compute.

Kubernetes platform for containerized services (Cytomine, Jupyter, ...).

7 of 31

SDS 2.0 Planned services

Sensitive data processing, CPU/GPU.

Sensitive data sharing.

Sensitive data primary storage.

Secure private remote desktop.

Private user-administered VMs over SSH.

Private web applications.

Private PACS.

Trusted medical imaging import.

8 of 31

SDS 2.0 Access modes

Not just for expert AI developers.

More types of user, including clinicians.

Multi-factor Life Science Login with your home organization account; no VPN needed.

Customer-managed groups using Perun.

Secure Linux remote desktop as default.

SaaS applications: PACS, Cytomine, Jupyter notebooks, etc.

SSH access to user-administered VMs for expert users.

<home organization>

Perun

9 of 31

SDS 2.0 Business model

Base model: User fees.

Service portfolio priced for sustainable operations and development.

Yearly membership fee provides basic service for typical research projects.

Additional services cost extra, e.g. GPU, primary storage, Sectra PACS, etc.

Fee waivers offered for co-funding communities and data sharing parties.

Discounts will be applied whenever possible to maximize high impact research output.

€€€

10 of 31

SDS 2.0 Basic service

Tentative fee: 50 kSEK/yr.

Up to 2 TB quota on private project storage (no backup) accessible through e.g. Windows file sharing.

Multifactor login using Life Science AAI and your home organization account.

Access to shared datasets on approval; shared datasets do not count toward the project storage quota.

11 of 31

SDS 2.0 Add-on services

Backed up primary storage: ~2.5 kSEK/TB/yr

Large volume project storage: ~1.5 kSEK/TB/yr

Large scale CPU compute: 24 kSEK/CPU/yr

GPU compute: 80 kSEK/GPU/yr

Data sharing: Free of charge. Help build the data lake / data commons.
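
As a rough illustration of how the tentative fees might combine, here is a minimal Python sketch. It assumes that add-ons are simply added on top of the basic yearly fee and that no waivers or discounts apply; the function name and parameters are hypothetical, and all prices are the tentative figures from these slides.

```python
# Tentative prices from the slides, in kSEK per year.
BASIC_FEE = 50.0                   # yearly membership fee (basic service)
BACKED_UP_STORAGE_PER_TB = 2.5     # approximate
LARGE_VOLUME_STORAGE_PER_TB = 1.5  # approximate
CPU_FEE = 24.0                     # per CPU
GPU_FEE = 80.0                     # per GPU


def estimate_yearly_cost(backed_up_tb=0.0, large_volume_tb=0.0, cpus=0, gpus=0):
    """Estimate the yearly cost in kSEK, before any waivers or discounts.

    Assumption (not stated in the slides): add-ons are purely additive
    on top of the basic fee.
    """
    amounts = {
        "backed_up_tb": backed_up_tb,
        "large_volume_tb": large_volume_tb,
        "cpus": cpus,
        "gpus": gpus,
    }
    for name, value in amounts.items():
        if value < 0:
            raise ValueError(f"{name} must not be negative")
    return (
        BASIC_FEE
        + BACKED_UP_STORAGE_PER_TB * backed_up_tb
        + LARGE_VOLUME_STORAGE_PER_TB * large_volume_tb
        + CPU_FEE * cpus
        + GPU_FEE * gpus
    )


# Example: basic service plus 10 TB backed up storage and 2 GPUs.
print(estimate_yearly_cost(backed_up_tb=10, gpus=2))  # 235.0
```

Data sharing is free of charge, so it does not appear as a cost term.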

12 of 31

SDS 2.0 Data sharing

FAIR data sharing with the world.

Make high-quality datasets citable using Digital Object Identifiers and Search Engine Optimized landing pages.

Personal data or anonymized data.

Manage access requests using Resource Entitlement Management System, or delegate handling to the AIDA Data Hub Data Access Committee.

Bigpicture download services.

13 of 31

SDS 2.0 Upcoming services

Secure remote desktop: Intended default interface.

Authorized data import/exports: PI can delegate import/export rights.

Telerad/DICOM destination: Receive images from specified scanners.

Sectra PACS: Project private Sectra PACS.

14 of 31

Bigpicture Petabyte platform for European digital pathology AI

AIDA Data Hub leading repository infrastructure development, which is carried out in collaboration with sensitive data teams at the NBIS Systems Development unit and CSC.fi.

Large scale archive operations started Mar 2023.

15 of 31

EUCAIM Federated infrastructure for cancer imaging data

AIDA Data Hub contributing data collaboration workspaces for use in EUCAIM with cancer imaging data based on Bigpicture Federated node technologies.

Collaboration with sensitive data teams at the NBIS Systems Development unit.

16 of 31

ASHA - Använda Standardiserade Hälsodata som Accelerator

RÖ-led VINNOVA systems demonstrator for data lake systems for primary and secondary use.

AIDA Data Hub provides spaces for secondary use.

17 of 31

Thank you!

AIDA Data Hub

Services for Clinical Innovation in Medical Imaging Diagnostic AI

National data infrastructure supporting the Analytic Imaging Diagnostic Arena (AIDA)

Hosted by LiU and the Center for Medical Image Science and Visualization (CMIV)

Part of SciLifeLab Bioinformatics platform (NBIS)

18 of 31

Questions?

19 of 31

Extra slides in case of questions

20 of 31

SCAPIS Image data sharing

through AIDA Data Hub

All SCAPIS imaging data to be shared through AIDA Data Hub (~100 TB) as 24 datasets.

Legal agreements being prepared.

Launch originally planned for AIDA Days in Gothenburg in Oct.

Tech solution is production ready.

Demo today.

21 of 31

Process overview

You ask SCAPIS for access to datasets.

SCAPIS tells us to give you access.

You get the data from AIDA Data Hub.

22 of 31

In more detail

  1. Researcher finds data
  2. Researcher applies for access
  3. SCAPIS approves access
  4. SCAPIS tells AIDA Data Hub to give access
  5. Researcher gets account at AIDA Data Hub
  6. Researcher downloads data
  7. Optional: Researcher joins AIDA and uses on-platform compute power
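
The steps above can be read as a simple linear workflow. The following Python sketch models that ordering; the `Step` names and both helper functions are hypothetical illustrations and are not part of any AIDA Data Hub or SCAPIS software.

```python
from enum import Enum


class Step(Enum):
    """The seven steps of the access process, in order."""
    FINDS_DATA = 1
    APPLIES_FOR_ACCESS = 2
    SCAPIS_APPROVES = 3
    SCAPIS_TELLS_HUB = 4
    GETS_ACCOUNT = 5
    DOWNLOADS_DATA = 6
    USES_PLATFORM_COMPUTE = 7  # optional


def next_step(current):
    """Return the step that follows `current`, or None after the last step."""
    if current is Step.USES_PLATFORM_COMPUTE:
        return None
    return Step(current.value + 1)


def can_download(completed):
    """A researcher can download once steps 1 to 5 have all been completed."""
    required = {Step(number) for number in range(1, 6)}
    return required <= set(completed)


done = [Step.FINDS_DATA, Step.APPLIES_FOR_ACCESS, Step.SCAPIS_APPROVES]
print(can_download(done))               # False: no hub access or account yet
print(next_step(Step.SCAPIS_APPROVES))  # Step.SCAPIS_TELLS_HUB
```

The point of the model is that the approval decision stays with SCAPIS (steps 3 and 4), while AIDA Data Hub only acts on it (steps 5 and 6).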

23 of 31

1. Researcher finds data

Use a normal web browser to search for good data.

The top hit is a landing page that describes a dataset on the platform.

The landing page is easy to find, because the page is made easy for computers to understand, aka "search engine optimised", using schema.org JSON-LD.
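
To illustrate what schema.org markup on a landing page can look like, here is a minimal Python sketch that builds a `Dataset` description and wraps it in a JSON-LD script tag. The dataset name, description, URL and identifier are placeholders invented for this example, and the fields actually used on the AIDA Data Hub landing pages may differ.

```python
import json


def dataset_jsonld(name, description, url, identifier):
    """Build a minimal schema.org Dataset description as a dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": identifier,
    }


def as_script_tag(metadata):
    """Wrap the metadata in the script tag used to embed JSON-LD in HTML."""
    body = json.dumps(metadata, indent=2, ensure_ascii=False)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# All values below are placeholders, not a real dataset record.
metadata = dataset_jsonld(
    name="Example imaging dataset",
    description="Placeholder description of an imaging dataset.",
    url="https://example.org/datasets/example",
    identifier="https://doi.org/10.0000/example",
)
print(as_script_tag(metadata))
```

Search engines that understand schema.org can read this block from the page source without having to interpret the visible page layout.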

Researcher

24 of 31

1. Researcher finds data

The landing page has basic information on the dataset.

It explains why you should bother applying for access.

Note: Google picks up our sample images and already shows them in its search results.

The "Apply for access" button takes you to SCAPIS.

Researcher

Apply for Access

25 of 31

2. Researcher applies for access

The researcher goes through normal SCAPIS procedures to apply for access to the dataset.

Researcher

SCAPIS

§

?

26 of 31

3. SCAPIS approves access

SCAPIS goes through the normal access request evaluation procedures, and approves the request.

§

SCAPIS

Researcher

👍

27 of 31

4. SCAPIS tells AIDA Data Hub to give access

SCAPIS instructs AIDA Data Hub to give the researcher access to the dataset.

§

SCAPIS

AIDA Data Hub

👍

28 of 31

5. Researcher gets account at AIDA Data Hub

High security service.

Three-factor authentication: 2FA SSL VPN + SSH public key.

Next generation SDS (SDS 2.0) will support Life Science AAI, using your normal institutional login.

AIDA Data Hub

AIDA DGX-2 Service

Service for best-in-class researchers in Swedish medical imaging diagnostic AI.

Secure enough for medical personal data.

Researcher

?

29 of 31

6. Researcher downloads data

Insert live demo here.

Researcher

30 of 31

7. Optional: Researcher joins AIDA and uses platform compute power

Current interface is "ssh tunnel + bash".

SDS 2.0 will offer more types of interface, suitable for a wider range of professions and competencies.

AIDA DGX-2 Service

Service for best-in-class researchers in Swedish medical imaging diagnostic AI.

Secure enough for medical personal data.

Researcher

31 of 31

PIECE OF CAKE! :D