The GA4GH Phenopacket Schema
A computable representation of clinical data
Monica Munoz-Torres, PhD, Jules Jacobsen, PhD, Peter Robinson, MD, MS, on behalf of all our coauthors.
Associate Professor, Department of Biomedical Informatics
Translational and Integrative Sciences Lab, Center for Health Artificial Intelligence
University of Colorado Anschutz Medical Campus
GA4GH
Bioinformatics Open Source Conference at Intelligent Systems for Molecular Biology | 13 July, 2022
monarchinitiative.org | @monimunozto | These slides: bit.ly/pp-ismb22
Global Alliance for Genomics & Health (GA4GH)
Aims to accelerate progress in genomic science and human health by developing standards and framing policy for responsible genomic and health-related data sharing.
Standard exchange formats existed for genome sequences but not for phenotypes
M. Munoz-Torres. BOSC at ISMB 2022.
Genes: VCF, GFF, BED
Phenotypes: (no standard exchange format)
We needed a standard way to share case-level phenotypic information: not free text, not a candidate diagnosis used as a proxy, and not full EHR data exported as PDF.
NEW: Phenopacket (PXF)
Phenopackets improve phenotype description
How severe are these?
Are some more severe than others?
When were they first observed?
Were they NOT observed?
How are these linked to a patient?
What about the parents and siblings?
A Community Effort
Requirements and specifications were established with a community of researchers and clinicians.
The schema underwent a rigorous peer review and product approval process.
Version 1.0 was released in 2019. Version 2.0 was developed from the feedback we received from the community and expands the data model to better represent temporality, medical actions, and quantitative measures.
The Phenopacket Schema
Schema Definition
Formally defined using Google's Protocol Buffers, version 3 (protobuf3): https://developers.google.com/protocol-buffers
Phenopacket Schema Overview
What’s in a Phenopacket?
https://phenopacket-schema.readthedocs.io/en/latest/phenopacket.html
Individual
Identifier, date of birth, age (time range), and sex of the patient, plus their vital status: whether they are alive and, if deceased, the cause of death.
individual:
  id: "patient:0"
  dateOfBirth: "1937-03-01T00:00:00Z"
  sex: "MALE"
  vitalStatus:
    status: "DECEASED"
    timeOfDeath:
      timestamp: "2019-10-06T10:54:20.021Z"
    causeOfDeath:
      id: "NCIT:C36263"
      label: "Metastatic Malignant Neoplasm"
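As a rough sketch of how the Individual element above might be consumed downstream, the Python below rebuilds the slide's example as a plain dictionary and derives the age at death from the two timestamps. It uses only the standard library; the helper names `parse_timestamp` and `age_at_death_years` are hypothetical and not part of any Phenopacket tooling.

```python
import json
from datetime import datetime
from typing import Optional


def parse_timestamp(ts: str) -> datetime:
    """Parse an ISO 8601 UTC timestamp such as '1937-03-01T00:00:00Z'."""
    # Older Python versions do not accept a trailing 'Z', so swap in an offset.
    return datetime.fromisoformat(ts.replace("Z", "+00:00"))


# The Individual from the slide, as the equivalent JSON-style dictionary.
individual = {
    "id": "patient:0",
    "dateOfBirth": "1937-03-01T00:00:00Z",
    "sex": "MALE",
    "vitalStatus": {
        "status": "DECEASED",
        "timeOfDeath": {"timestamp": "2019-10-06T10:54:20.021Z"},
        "causeOfDeath": {
            "id": "NCIT:C36263",
            "label": "Metastatic Malignant Neoplasm",
        },
    },
}


def age_at_death_years(ind: dict) -> Optional[int]:
    """Return completed years of life, or None if the individual is not deceased."""
    vital = ind.get("vitalStatus", {})
    if vital.get("status") != "DECEASED":
        return None
    born = parse_timestamp(ind["dateOfBirth"])
    died = parse_timestamp(vital["timeOfDeath"]["timestamp"])
    years = died.year - born.year
    # Subtract one year if the birthday had not yet occurred in the year of death.
    if (died.month, died.day) < (born.month, born.day):
        years -= 1
    return years


if __name__ == "__main__":
    print(json.dumps(individual, indent=2))
    print("Age at death (years):", age_at_death_years(individual))
```

Keeping the record as a plain dictionary makes it easy to round-trip through JSON or YAML; a real pipeline would more likely use classes generated from the protobuf definition.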
Phenotypic Features
Typically, characteristics that are more descriptive than quantifiable, such as ‘anosmia’, ‘fever’, and ‘dyspnea’
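The questions raised earlier (how severe, when first observed, whether explicitly NOT observed) map onto fields of a phenotypic feature. The sketch below is one possible way to assemble such features as plain dictionaries. The field names `type`, `excluded`, `severity`, and `onset` follow our reading of the version 2 schema documentation; the helper `phenotypic_feature` is hypothetical, and the ontology identifiers are illustrative and should be checked against the Human Phenotype Ontology before use.

```python
from typing import Optional, Tuple


def phenotypic_feature(
    term_id: str,
    label: str,
    excluded: bool = False,
    severity: Optional[Tuple[str, str]] = None,
    onset_age: Optional[str] = None,
) -> dict:
    """Build a phenotypic feature as a JSON-style dictionary (hypothetical helper)."""
    feature = {"type": {"id": term_id, "label": label}}
    if excluded:
        # Records that the feature was looked for and NOT observed.
        feature["excluded"] = True
    if severity is not None:
        feature["severity"] = {"id": severity[0], "label": severity[1]}
    if onset_age is not None:
        # Age at onset as an ISO 8601 duration, e.g. "P40Y" for 40 years.
        feature["onset"] = {"age": {"iso8601duration": onset_age}}
    return feature


# Illustrative identifiers; verify against the ontology before relying on them.
features = [
    phenotypic_feature("HP:0001945", "Fever",
                       severity=("HP:0012828", "Severe"), onset_age="P40Y"),
    phenotypic_feature("HP:0002094", "Dyspnea", onset_age="P40Y"),
    phenotypic_feature("HP:0000458", "Anosmia", excluded=True),
]

observed = [f["type"]["label"] for f in features if not f.get("excluded", False)]
not_observed = [f["type"]["label"] for f in features if f.get("excluded", False)]

if __name__ == "__main__":
    print("Observed:", observed)
    print("Explicitly excluded:", not_observed)
```

Recording excluded features explicitly, rather than simply omitting them, lets downstream tools distinguish "absent" from "not assessed".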
Measurements
Quantitative and qualitative (yes/no, red/white/blue…) descriptions of a patient or biosample, with temporality (timestamp, time range, age, age range, ontology term)
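To make the quantitative versus qualitative distinction concrete, here is a minimal sketch in which a measurement's value holds either a `quantity` (a number with a unit) or an `ontologyClass` (a categorical result). The field names follow our reading of the version 2 schema documentation. The function `describe` is hypothetical, the `EXAMPLE:` identifiers are placeholders rather than real ontology terms, and the LOINC and UCUM codes are illustrative and should be verified.

```python
def describe(measurement: dict) -> str:
    """Render a measurement as 'assay: value' (hypothetical helper)."""
    assay = measurement["assay"]["label"]
    value = measurement["value"]
    if "quantity" in value:
        # Quantitative result: a number together with its unit.
        quantity = value["quantity"]
        return f"{assay}: {quantity['value']} {quantity['unit']['label']}"
    if "ontologyClass" in value:
        # Qualitative result: a categorical term such as present/absent.
        return f"{assay}: {value['ontologyClass']['label']}"
    raise ValueError("value must hold a quantity or an ontologyClass")


# Quantitative example (codes are illustrative).
temperature = {
    "assay": {"id": "LOINC:8310-5", "label": "Body temperature"},
    "value": {
        "quantity": {
            "unit": {"id": "UCUM:Cel", "label": "degree Celsius"},
            "value": 38.5,
        }
    },
    "timeObserved": {"age": {"iso8601duration": "P82Y"}},
}

# Qualitative example (EXAMPLE: identifiers are placeholders).
nitrite = {
    "assay": {"id": "EXAMPLE:0001", "label": "Nitrite presence in urine"},
    "value": {"ontologyClass": {"id": "EXAMPLE:0002", "label": "Present"}},
    "timeObserved": {"age": {"iso8601duration": "P82Y"}},
}

if __name__ == "__main__":
    for m in (temperature, nitrite):
        print(describe(m))
```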
Medical Actions
Covers pharmaceutical treatments, surgical procedures, radiation therapy, and therapeutic regimens.
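Since a medical action describes one of these four kinds of intervention, a consumer may want to check which kind a given record carries. The sketch below assumes the JSON keys `procedure`, `treatment`, `radiationTherapy`, and `therapeuticRegimen`, based on our reading of the version 2 schema documentation. The function `action_kind` is hypothetical, and the `EXAMPLE:` identifiers are placeholders rather than real ontology terms.

```python
# Assumed JSON keys for the four kinds of medical action.
ACTION_KINDS = ("procedure", "treatment", "radiationTherapy", "therapeuticRegimen")


def action_kind(medical_action: dict) -> str:
    """Return the single kind of action present, or raise if none or several are set."""
    present = [kind for kind in ACTION_KINDS if kind in medical_action]
    if len(present) != 1:
        raise ValueError(
            f"expected exactly one of {ACTION_KINDS}, found {present or 'none'}"
        )
    return present[0]


# A pharmaceutical treatment (identifiers are placeholders).
drug_treatment = {
    "treatment": {
        "agent": {"id": "EXAMPLE:drug-0001", "label": "example drug"},
    },
    "treatmentTarget": {"id": "EXAMPLE:disease-0001", "label": "example disease"},
}

# A surgical procedure (identifiers are placeholders).
surgery = {
    "procedure": {
        "code": {"id": "EXAMPLE:proc-0001", "label": "example surgical procedure"},
    },
}

if __name__ == "__main__":
    for action in (drug_treatment, surgery):
        print(action_kind(action))
```

Enforcing "exactly one kind" in plain Python mirrors what a protobuf `oneof` field would guarantee in generated classes.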
Phenopackets and other clinical data standards
Variation Representation Specification (VRS)
GA4GH
Documentation, Repo, Users, & Use Cases
Phenotype data exchange in the biomedical ecosystem
Phenopackets can improve the speed and accuracy of diagnosis as well as treatment effectiveness
State-of-the-art of patient phenotyping
Thank you!
The GA4GH Phenopacket Modeling Consortium
Julius O. B. Jacobsen, Michael Baudis, Gareth S. Baynam, Jacques S. Beckmann, Sergi Beltran, Orion J. Buske, Tiffany J. Callahan, Christopher G. Chute, Mélanie Courtot, Daniel Danis, Olivier Elemento, Andrea Essenwanger, Robert R. Freimuth, Michael A. Gargano, Tudor Groza, Ada Hamosh, Nomi L. Harris, Rajaram Kaliyaperumal, Kevin C. Kent Lloyd, Aly Khalifa, Peter M. Krawitz, Sebastian Köhler, Brian J. Laraway, Heikki Lehväslaiho, Leslie Matalonga, Julie A. McMurry, Alejandro Metke-Jimenez, Christopher J. Mungall, Monica C. Munoz-Torres, Soichi Ogishima, Anastasios Papakonstantinou, Davide Piscia, Nikolas Pontikos, Núria Queralt-Rosinach, Marco Roos, Julian Sass, Paul N. Schofield, Dominik Seelow, Anastasios Siapos, Damian Smedley, Lindsay D. Smith, Robin Steinhaus, Jagadish Chandrabose Sundaramurthi, Emilia M. Swietlik, Sylvia Thun, Nicole A. Vasilevsky, Alex H. Wagner, Jeremy L. Warner, Claus Weiland, Melissa A. Haendel, and Peter N. Robinson.
JOBJ, MCMT*, CGC, TG, AH, NLH, JAM, CJM, DS, NAV, MAH*, and PNR* are funded by NIH at NHGRI RM1 HG010860, OD R24OD011883, and *NLM 75N97019P00280.