AIDA Data Hub
Services for Clinical Innovation in Medical Imaging Diagnostic AI.
National data infrastructure supporting the Analytic Imaging Diagnostic Arena (AIDA)
Hosted by LiU and the Center for Medical Image Science and Visualization (CMIV)
Funded by SciLifeLab Bioinformatics platform (NBIS)
2024-02-07: Sensitive Data Services 2.0 for AIDA Days at CMIV
Secure AI training systems
Set up at CMIV in collaboration with NVIDIA, serving researchers in the AIDA community.
Hosting the VINNOVA-funded SCAPIS data lab, where AI researchers can securely process SCAPIS data for research.
15M extension during 2024: more hardware, usable by a wider range of professions.
LiU procurement unit engaged.
AIDA DGX-2 Service
Service for best-in-class researchers in Swedish medical imaging diagnostic AI.
Secure enough for medical personal data.
AIDA Data Hub SDS 2.0
Next generation Sensitive Data Services.
Based on Bigpicture/GDI technologies.
Procurement underway, with co-funding to support several communities.
Discussions with further co-funders are ongoing.
SDS 2.0 Establishment
Now: dialogue with LiU procurement.
Hardware installation and progressive rollout of services in Q2 2024.
Basic services for technical experts.
Progressively more advanced services for a progressively broader audience.
Service delivery roadmap and iterative development priorities will be based on continuous stakeholder dialogue.
SDS 2.0 Customer model
Ethically approved research project: research institute represented by a competent researcher (PI).
Customer segmentation: you cannot see other customers, they cannot see you.
Customer makes security decisions appropriate for their project.
SDS 2.0 Infrastructure
Secure enough for large amounts of information of extreme confidentiality.
Still physically located in RÖ secure data centers for EHR production systems, in a submarine in a bunker, with guards.
Petabyte-scale Ceph object storage.
OpenStack platform, providing virtual machines for GPU/CPU compute.
Kubernetes platform for containerized services (Cytomine, Jupyter, ...).
SDS 2.0 Planned services
Sensitive data processing, CPU/GPU.
Sensitive data sharing.
Sensitive data primary storage.
Secure private remote desktop.
Private user-administered VMs over SSH.
Private web applications.
Private PACS.
Trusted medical imaging import.
SDS 2.0 Access modes
Not just for expert AI developers.
More types of user, including clinicians.
Multi-factor Life Science Login with your home organization account, no VPN.
Customer-managed groups using Perun.
Secure Linux remote desktop as default.
SaaS applications: PACS, Cytomine, Jupyter notebooks, etc.
SSH access to a user-administered VM for expert users.
<home organization>
Perun
SDS 2.0 Business model
Base model: User fees.
Service portfolio priced for sustainable operations and development.
Yearly membership fee provides basic service for typical research projects.
Additional services cost extra, e.g. GPU, primary storage, Sectra PACS, etc.
Fee waivers offered for co-funding communities and data sharing parties.
Discounts will be applied whenever possible to maximize high impact research output.
SDS 2.0 Basic service
Tentative fee: 50 kSEK/yr.
Up to 2 TB quota on private project storage (no backup) accessible through e.g. Windows file sharing.
Multifactor login using Life Science AAI and your home organization account.
Access to shared datasets on approval; these do not count toward the project storage quota.
SDS 2.0 Add-on services
Backed-up primary storage: ~2.5 kSEK/TB/yr
Large-volume project storage: ~1.5 kSEK/TB/yr
Large-scale CPU compute: 24 kSEK/CPU/yr
GPU compute: 80 kSEK/GPU/yr
Data sharing: free of charge. Help build the data lake / data commons.
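As a rough illustration of how the tentative prices above combine, the following Python sketch adds the basic fee to the add-on fees for one project year. The class and function names are hypothetical, and the prices are the tentative figures from these slides. It assumes add-ons are charged on top of the basic service, with no fee waivers or discounts applied.

```python
from dataclasses import dataclass

# Tentative prices from the slides, in kSEK per year.
BASIC_FEE = 50.0                    # yearly membership, basic service
BACKED_UP_STORAGE_PER_TB = 2.5      # approximate
LARGE_VOLUME_STORAGE_PER_TB = 1.5   # approximate
CPU_FEE = 24.0                      # per CPU
GPU_FEE = 80.0                      # per GPU


@dataclass
class ProjectRequest:
    """Hypothetical description of the add-ons a project asks for."""
    backed_up_tb: float = 0.0
    large_volume_tb: float = 0.0
    cpus: int = 0
    gpus: int = 0


def estimate_yearly_fee(req: ProjectRequest) -> float:
    """Return an estimated yearly fee in kSEK (assumption: simple sum)."""
    return (
        BASIC_FEE
        + req.backed_up_tb * BACKED_UP_STORAGE_PER_TB
        + req.large_volume_tb * LARGE_VOLUME_STORAGE_PER_TB
        + req.cpus * CPU_FEE
        + req.gpus * GPU_FEE
    )


if __name__ == "__main__":
    request = ProjectRequest(backed_up_tb=10, large_volume_tb=100, cpus=2, gpus=1)
    print(f"Estimated fee: {estimate_yearly_fee(request)} kSEK/yr")
```

Data sharing is free of charge according to the slides, so it adds nothing to the estimate.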
SDS 2.0 Data sharing
FAIR data sharing with the world.
Make high-quality datasets citable using Digital Object Identifiers and search-engine-optimized landing pages.
Personal data or anonymized data.
Manage access requests using the Resource Entitlement Management System (REMS), or delegate handling to the AIDA Data Hub Data Access Committee.
Bigpicture download services.
SDS 2.0 Upcoming services
Secure remote desktop: intended default interface.
Authorized data import/exports: PI can delegate import/export rights.
Telerad/DICOM destination: receive images from specified scanners.
Sectra PACS: project-private Sectra PACS.
Bigpicture Petabyte platform for European digital pathology AI
AIDA Data Hub leading repository infrastructure development, which is carried out in collaboration with sensitive data teams at the NBIS Systems Development unit and CSC.fi.
Large-scale archive operations started Mar 2023.
EUCAIM Federated infrastructure for cancer imaging data
AIDA Data Hub contributing data collaboration workspaces for use in EUCAIM with cancer imaging data, based on Bigpicture federated node technologies.
Collaboration with sensitive data teams at the NBIS Systems Development unit.
ASHA - Använda Standardiserade Hälsodata som Accelerator
RÖ-led VINNOVA systems demonstrator for data lake systems for primary and secondary use.
AIDA Data Hub provides spaces for secondary use.
Thank you!
AIDA Data Hub
Services for Clinical Innovation in Medical Imaging Diagnostic AI
National data infrastructure supporting the Analytic Imaging Diagnostic Arena (AIDA)
Hosted by LiU and the Center for Medical Image Science and Visualization (CMIV)
Part of SciLifeLab Bioinformatics platform (NBIS)
Questions?
Extra slides in case of questions
SCAPIS image data sharing through AIDA Data Hub
All SCAPIS imaging data to be shared through AIDA Data Hub (~100 TB) as 24 datasets.
Legal agreements being prepared.
Launch originally planned for AIDA Days in Gothenburg in Oct.
Tech solution is production ready.
Demo today.
Process overview
You ask SCAPIS for access to datasets.
SCAPIS tells us to give you access.
You get the data from AIDA Data Hub.
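The three steps above can be read as a small state machine: a request is made, the data owner approves it, and only then does the hub grant access. The Python sketch below models that ordering. All class and method names are hypothetical and chosen for illustration only; this is not the actual AIDA Data Hub or SCAPIS implementation.

```python
from enum import Enum, auto


class AccessState(Enum):
    REQUESTED = auto()  # researcher has asked the data owner for access
    APPROVED = auto()   # data owner has approved the request
    GRANTED = auto()    # hub has given the researcher access


class AccessRequest:
    """Hypothetical model of one researcher's request for one dataset."""

    def __init__(self, researcher: str, dataset: str):
        self.researcher = researcher
        self.dataset = dataset
        self.state = AccessState.REQUESTED

    def approve_by_data_owner(self) -> None:
        # Step 2: the data owner evaluates and approves the request.
        if self.state is not AccessState.REQUESTED:
            raise ValueError("only a pending request can be approved")
        self.state = AccessState.APPROVED

    def grant_by_hub(self) -> None:
        # Step 3: the hub acts only on the data owner's instruction.
        if self.state is not AccessState.APPROVED:
            raise ValueError("the hub grants access only after approval")
        self.state = AccessState.GRANTED

    def can_download(self) -> bool:
        return self.state is AccessState.GRANTED


if __name__ == "__main__":
    request = AccessRequest("researcher@example.org", "example-dataset")
    request.approve_by_data_owner()
    request.grant_by_hub()
    print(request.can_download())
```

The point of the design is that the hub never decides on access itself; it only carries out a decision made by the data owner.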
In more detail
1. Researcher finds data
Use a normal web browser to search for good data.
The top hit is a landing page that describes a dataset on the platform.
The landing page is easy to find because it is made easy for computers to understand, i.e. "search engine optimised", using schema.org JSON-LD.
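To make the schema.org idea concrete, here is a minimal Python sketch that builds a schema.org `Dataset` description and wraps it in the `application/ld+json` script element that search engines read from a landing page. The function names and all values (name, description, URL, DOI) are placeholders invented for illustration. The metadata actually published on the AIDA Data Hub landing pages is not shown in these slides and may use more or different properties.

```python
import json


def dataset_jsonld(name: str, description: str, url: str, doi: str) -> dict:
    """Build a schema.org Dataset description as a plain dict."""
    return {
        "@context": "https://schema.org",
        "@type": "Dataset",
        "name": name,
        "description": description,
        "url": url,
        "identifier": f"https://doi.org/{doi}",
    }


def as_script_tag(metadata: dict) -> str:
    """Wrap the metadata in the script element embedded in the HTML page."""
    body = json.dumps(metadata, indent=2)
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Placeholder values only; not a real dataset or DOI.
example = dataset_jsonld(
    name="Example imaging dataset",
    description="Placeholder description of a shared imaging dataset.",
    url="https://example.org/datasets/example",
    doi="10.1234/example",
)

if __name__ == "__main__":
    print(as_script_tag(example))
```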
Researcher
1. Researcher finds data
The landing page has basic information on the dataset.
It explains why you should bother applying for access.
Note: Google picks up our sample images and already shows them in its search results.
The "Apply for access" button takes you to SCAPIS.
Researcher
Apply for Access
2. Researcher applies for access
The researcher goes through normal SCAPIS procedures to apply for access to the dataset.
Researcher
SCAPIS
3. SCAPIS approves access
SCAPIS goes through the normal access request evaluation procedures, and approves the request.
SCAPIS
Researcher
👍
4. SCAPIS tells AIDA Data Hub to give access
SCAPIS instructs AIDA Data Hub to give the researcher access to the dataset.
SCAPIS
AIDA Data Hub
👍
5. Researcher gets account at AIDA Data Hub
High security service.
Three-factor authentication: two-factor SSL VPN plus SSH public key.
Next-generation SDS (SDS 2.0) will support Life Science AAI, using your normal institutional login.
AIDA Data Hub
AIDA DGX-2 Service
Service for best-in-class researchers in Swedish medical imaging diagnostic AI.
Secure enough for medical personal data.
Researcher
6. Researcher downloads data
Insert live demo here.
Researcher
7. Optional: Researcher joins AIDA and uses platform compute power
Current interface is "SSH tunnel + bash".
SDS 2.0 will offer more types of interface, suitable for a wider range of professions and competencies.
AIDA DGX-2 Service
Service for best-in-class researchers in Swedish medical imaging diagnostic AI.
Secure enough for medical personal data.
Researcher
PIECE OF CAKE! :D