ga4gh.org

Beacon Extensions

Chair: Jordi Rambla

Agenda

Topic                  Presented by             Time
General introduction   Jordi Rambla             5’
OMOP                   Sergi Aguiló-Castillo    15’
RNAget                 Emilio Palumbo           15’
EJP-RD                 Tony Brookes             15’
Cancer registries      Jordi Rambla             15’
Beacon 4 images        Jordi Rambla             15’
Wrap-up                Jordi Rambla             10’

Beacon-OMOP CDM

Sergi Aguiló-Castillo

This work has been funded by the Institute of Health Carlos III (project IMPaCT-Data, exp. IMP/00019), co-funded by the European Union, European Regional Development Fund (ERDF, “A way to make Europe”).

OMOP CDM

Observational Medical Outcomes Partnership (OMOP) Common Data Model (CDM)

  • Merge dispersed databases into a common format (data model).
  • Common representation (terminologies, vocabularies, coding schemes).
  • Widely used by hospitals around the world to share data for secondary (research) use.

Beacon RI API

Beacon RI API

OMOP CDM

Relational database

We have hijacked the API to query the new database

Beacon RI & OMOP CDM

  • Implementation based on the RI-API
  • Queries a relational DB instead of a NoSQL one
  • Individuals, Biosamples and Cohorts models from Beacon
  • Mapping between OMOP CDM and Beacon made by experts
  • No need to convert anything: SQL queries are run on the fly, directly against the database
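
A minimal sketch of the on-the-fly idea, not the actual RI code: a Beacon-style filter is mapped to an OMOP CDM table and column, and a count query is run directly against the relational database. The table and column names (person, condition_occurrence, person_id, gender_concept_id, condition_concept_id) follow the OMOP CDM; the FILTER_TO_OMOP mapping, the function names, the in-memory SQLite database and the concept ids are assumptions made for illustration.

```python
import sqlite3

# Hypothetical mapping from a Beacon filter scope to the OMOP CDM
# table/column that stores the matching concept ids.
FILTER_TO_OMOP = {
    "disease": ("condition_occurrence", "condition_concept_id"),
    "sex": ("person", "gender_concept_id"),
}


def count_individuals(conn, scope, concept_id):
    """Count distinct persons having a record with the given concept id."""
    # Table and column names come from our own mapping, never from user input;
    # only the concept id is passed as a bound parameter.
    table, column = FILTER_TO_OMOP[scope]
    sql = f"SELECT COUNT(DISTINCT person_id) FROM {table} WHERE {column} = ?"
    return conn.execute(sql, (concept_id,)).fetchone()[0]


def beacon_summary(count):
    """Wrap a count in a Beacon-style response summary (simplified)."""
    return {"responseSummary": {"exists": count > 0, "numTotalResults": count}}


def demo_connection():
    """Build a tiny in-memory stand-in for an OMOP CDM database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (
            person_id INTEGER PRIMARY KEY,
            gender_concept_id INTEGER,
            year_of_birth INTEGER);
        CREATE TABLE condition_occurrence (
            condition_occurrence_id INTEGER PRIMARY KEY,
            person_id INTEGER,
            condition_concept_id INTEGER);
        -- Concept ids below are placeholders for the demo.
        INSERT INTO person VALUES (1, 8532, 1980), (2, 8507, 1992), (3, 8532, 2001);
        INSERT INTO condition_occurrence VALUES
            (10, 1, 1001), (11, 1, 1001), (12, 2, 1001), (13, 3, 2002);
        """
    )
    return conn


if __name__ == "__main__":
    conn = demo_connection()
    print(beacon_summary(count_individuals(conn, "disease", 1001)))
```

Person 1 has two records for the same condition but is counted once, which is why the query uses COUNT(DISTINCT person_id).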

Beacon RI & OMOP CDM

"Concussion with no loss of consciousness"

Next steps

  • More testing, with real data, for corner cases.
  • Beacon RI All-in-one: User can choose the type of database (MongoDB or OMOP CDM, for now).
  • More schemas: Genomic or Radiology extension of OMOP CDM? (talk on that later in the session)
  • Suggestions?

RNAget

Emilio Palumbo

Introduction

RNAget: an API to securely retrieve RNA quantifications

Bioinformatics, Volume 39, Issue 4, April 2023, btad126, https://doi.org/10.1093/bioinformatics/btad126

Motivation

  • lack of discovery, searching and filtering capabilities, e.g.:
    • filter expression matrix by expression values
    • look for datasets with a specific expression for a set of features
    • given a dataset, look for samples with a specific expression for a set of features and extract metadata

use the Beacon framework

Motivation

  • adding context (e.g. samples and features metadata)
    • harmonised metadata model

use the Beacon model

Requirements

  • feature model
    • feature object
      • Can be gene, transcript, genomicLocation, etc.
      • Has attributes, e.g.: name, referenceAnnotation, type
    • get feature identifier by feature attributes (to be used in expression queries)
    • SequenceAnnotation group
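
One way to picture the feature requirement, as a sketch only: a feature object carrying the attributes named above, plus a lookup that returns feature identifiers for given attribute values. The class, the function name, the identifiers and the sample values are all hypothetical; no such schema is defined yet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Feature:
    """Hypothetical feature object; attribute names taken from the slide."""
    id: str
    name: str
    type: str  # e.g. "gene", "transcript", "genomicLocation"
    referenceAnnotation: str


def find_feature_ids(features, **attributes):
    """Return ids of features whose attributes all equal the given values."""
    return [
        f.id
        for f in features
        if all(getattr(f, key) == value for key, value in attributes.items())
    ]


# Made-up catalogue used only to exercise the lookup.
FEATURES = [
    Feature("feat-1", "GENE_A", "gene", "annotation-v1"),
    Feature("feat-2", "GENE_A-201", "transcript", "annotation-v1"),
    Feature("feat-3", "GENE_B", "gene", "annotation-v1"),
]

if __name__ == "__main__":
    print(find_feature_ids(FEATURES, type="gene"))
    print(find_feature_ids(FEATURES, name="GENE_A", type="gene"))
```

The returned identifiers are what an expression query would then reference.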

Requirements

  • expression-specific filtering
    • define filter payload (extend one of the filteringTerms definitions or add a new one)
    • query multiple features by expression value
    • query by expression range
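
A possible shape for such a filter, sketched under assumptions: each filter names a feature and an optional expression range, and a sample matches when every filter is satisfied. The payload keys (featureId, minValue, maxValue), the AND logic across filters and the toy expression matrix are illustrative choices, not part of any agreed specification.

```python
# Toy expression matrix: sample id -> {feature id -> expression value}.
EXPRESSION = {
    "sample-1": {"feat-1": 12.0, "feat-3": 0.5},
    "sample-2": {"feat-1": 80.0, "feat-3": 3.0},
    "sample-3": {"feat-1": 7.5, "feat-3": 4.2},
}


def in_range(value, expression_filter):
    """True if value lies within the filter's (optionally open-ended) range."""
    low = expression_filter.get("minValue", float("-inf"))
    high = expression_filter.get("maxValue", float("inf"))
    return low <= value <= high


def samples_matching(matrix, filters):
    """Return sample ids that satisfy every expression filter (AND logic)."""
    matches = []
    for sample_id, values in matrix.items():
        if all(
            f["featureId"] in values and in_range(values[f["featureId"]], f)
            for f in filters
        ):
            matches.append(sample_id)
    return matches


if __name__ == "__main__":
    filters = [
        {"featureId": "feat-1", "minValue": 5.0, "maxValue": 50.0},
        {"featureId": "feat-3", "minValue": 1.0},  # open upper bound
    ]
    print(samples_matching(EXPRESSION, filters))
```

Leaving out minValue or maxValue gives an open-ended range, so the same payload covers both "query by expression value" thresholds and bounded ranges.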

European Joint Program on Rare Diseases

Prof. Anthony J. Brookes

[Diagram labels: biobank; registry; tools catalogs; registries/biobanks catalogs; cell lines; animal models; data deposition & analysis platforms; knowledge bases; support for clinical/translational research]

[Diagram labels: Find; Query; Analyse]

https://vp.ejprarediseases.org

Standards Based (international)

GA4GH, IRDiRC, ELIXIR, BBMRI, ...

  • DUC/CCE: Co-lead - digital structure for consent and use conditions
  • Beacon-2: Co-lead - API for federated querying and data sharing
  • OMOP: Use - common structure for health related data
  • Omics data: Use - file formats for high dimensionality data
  • CDE & DS-CDE: Develop - RD core & domain-specific data elements
  • Metadata: Develop - DCAT extensions for cohorts, data/samplesets, etc.
  • Phenopackets: Use - validate and consume (Originator installing CV)
  • Security approaches: Use - ensure security of data and software

Querying

= Implemented

https://github.com/ejp-rd-vp/vp-api-specs

/catalogs endpoint

/individuals endpoint

Beacon ‘Enhancements’ required to enable EJP-RD

  • Range responses
    • minRange and maxRange employed in “info” within “results”
    • With resultCount being the minRange
  • Usage of “arrays of values” in filters, so that multiple terms can be queried
    • ‘OR’ logic between terms
  • Support partial queries
    • Warning “info” within “results” to specify unsupported filters
  • Introduction of auth-key in a custom header for security purposes
  • Need improved semantic support in responses [e.g., JSON-LD, application/ld+json]
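
The range, array-filter and partial-query behaviours listed for EJP-RD can be sketched as below. This is an illustration under assumptions, not the vp-api-specs implementation: the filter shape ({"id": ..., "value": [...]}), the bucket boundaries, the supported filter set, the AND logic across different filters and the sample records are all made up.

```python
# Hypothetical count buckets used to avoid disclosing exact numbers.
RANGES = [(0, 0), (1, 9), (10, 49), (50, 99)]

# Hypothetical set of filter ids this resource can answer.
SUPPORTED = {"disease", "sex"}

# Made-up records: 7 + 5 + 3 individuals.
RECORDS = (
    [{"disease": "D1", "sex": "F"}] * 7
    + [{"disease": "D2", "sex": "M"}] * 5
    + [{"disease": "D3", "sex": "F"}] * 3
)


def matches(record, filters):
    """OR within one filter's array of values, AND across filters (assumed)."""
    return all(record.get(f["id"]) in f["value"] for f in filters)


def to_range(count):
    """Map an exact count to its (minRange, maxRange) bucket."""
    for low, high in RANGES:
        if low <= count <= high:
            return low, high
    return RANGES[-1][1] + 1, None  # open-ended top bucket


def query(records, filters):
    """Answer a partial query: ignore unsupported filters and warn about them."""
    usable = [f for f in filters if f["id"] in SUPPORTED]
    ignored = [f["id"] for f in filters if f["id"] not in SUPPORTED]
    count = sum(matches(r, usable) for r in records)
    low, high = to_range(count)
    info = {"minRange": low, "maxRange": high}
    if ignored:
        info["warning"] = "Unsupported filters ignored: " + ", ".join(ignored)
    # resultCount carries the lower bound of the range, as on the slide.
    return {"resultCount": low, "info": info}


if __name__ == "__main__":
    print(query(RECORDS, [{"id": "disease", "value": ["D1", "D2"]}]))
```

With the sample records, querying diseases D1 or D2 matches 12 individuals, so the response reports the 10 to 49 bucket and a resultCount of 10.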

Beacon for Cancer Registries

Jordi Rambla

Beacon for Cancer Registries

Idea born in the EOSC4Cancer project

From screening & registries to clinical trials

Beacon for Cancer Registries - EOSC4Cancer

[Diagram labels: Prevention; Cancer origin; Diagnostics; Primary tumours; Treatment; Metastatic Cancers]

[Diagram labels: Cancer Registries; Environment (pollutants); Social data; Geolocalisation; Research Software for epidemiological level analysis; Software, Workflows and Portals for patient level clinical research; Clinical Decision Support Systems]

[Diagram labels: Screening Programmes; Medical imaging; Medical data (EHR structured data)]

[Diagram labels: Cancer Research; Patient level: Genomics data, Imaging (digital pathology), and EHR data (structured data); Other research data: liquid biopsy, animal models, drug screening; Non patient specific data: animal models, drug screening, etc.]

[Diagram labels: Clinical trials; Cancer clinical trials (descriptors - metadata); Patient level genomics and medical data; Actionable data (regulated): markers, drugs, and treatments]

Beacon for Cancer Registries - ENCR

  • Cancer Registries have data at individual level
  • Cancer Registries publish statistical studies on cancer incidence in a given population, so some aggregated/precomputed information is publicly available
  • Cancer Registries in Europe follow the European Network of Cancer Registries (ENCR) recommendations
  • These recommendations include a data model for sharing data

Beacon for Cancer Registries - Proof of Concept

  • We used the ENCR Model instead of Beacon v2
  • It is simple: a single table that includes data on:
    • Individual
    • Disease onset
    • Disease recurrence
    • Treatment
  • Attributes are highly codified

Beacon for Cancer Registries

Beacon for Cancer Registries - Next steps

  • Discussed with intramural partners
  • Present and gather feedback from
    • Unveiled today - GA4GH Connect
    • EOSC4Cancer General Assembly - Next week
    • ENCR Annual meeting - Granada Oct’23
  • Analyze the feedback and draft a plan

Beacon for Images

Jordi Rambla

Beacon for Images (B4I)

  • Idea born in the IMPaCT-Data and EUCAIM projects
  • Imaging is a huge domain; B4I focuses on radiology

Beacon for Images (B4I) - Proof of Concept

  • We extended the Beacon v2 Model
  • Follows the OHDSI OMOP approach, looking at the radiology extension
  • Includes data on:
    • Individuals
    • Image acquisition and processing
    • Image features
  • Attributes highly codified in OMOP Dictionaries

Beacon for Images (B4I)

Beacon for Images (B4I)

  • Discussed with intramural partners
  • Present and gather feedback from
    • Unveiled today - GA4GH Connect
    • IMPaCT-Data General Assembly
    • EUCAIM
  • Analyze the feedback and draft a plan
  • Integrate the PoC into the OMOP Beacon

Wrap up!

Jordi Rambla
