A Simple Standard for Ontological Mappings 2023: Updates on data model, collaborations and tooling (OM 2023, Athens, Greece)
Anita Caron (EMBL-EBI), Benjamin Gyori (Harvard Medical School), Cassia Trojahn (Universite Toulouse 2), Charles Tapley Hoyt (Northeastern University), Chris Mungall (LBNL), Damien Goutte-Gattat (FlyBase), Emily Hartley (C-Path), Harshad Hegde (LBNL), Huanyu Li (Linköping University), Hyeongsik Kim (Bosch), Ian Braun (C-Path), James McLaughlin (EMBL-EBI), Nicolas Matentzoglu (Semanticly), Nicole Vasilevsky (CU Anschutz), Nomi Harris (LBNL), Sven Hertling (University of Mannheim)
* SSSOM can be pronounced “sessom”
https://w3id.org/sssom
Biomedical Data
Open SSSOM*!
I don’t think he is pronouncing it right…
What are entity mappings?
“Friedreich's Ataxia”
OMOP:441554
Entities are symbols, such as codes in a terminology, classes in an ontology, permissible values in a data model, identifiers in a database or simply strings in a text field that are intended to refer to a real world thing.
The anatomy of a semantic entity mapping
are insufficient
SUBJECT
subject_id: EFO:10000070
subject_label: Adenofibroma
PREDICATE
predicate_id: skos:exactMatch
OBJECT
object_id: MONDO:0006071
object_label: adenofibroma
JUSTIFICATION
mapping_justification: semapv:LexicalMatching
subject_match_field: rdfs:label
object_match_field: oio:hasExactSynonym
match_string: adenofibroma
mapping_date: 2022-12-13
reviewer_id: orcid:0000-0002-7356-1779
mapping_tool: wikidata:Q64360017
confidence: 0.8
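The record above can be sketched as a small typed structure. This is only an illustration: the class name `Mapping` and the `triple` helper are hypothetical, the field names are copied from the slide, and the range check assumes confidence is a score between 0 and 1.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    """One mapping record; field names follow the slots shown on the slide."""

    subject_id: str
    predicate_id: str
    object_id: str
    mapping_justification: str
    subject_label: Optional[str] = None
    object_label: Optional[str] = None
    confidence: Optional[float] = None

    def __post_init__(self):
        # Assumption: confidence, when present, is a score in [0, 1].
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")

    def triple(self):
        """Return the bare subject-predicate-object statement."""
        return (self.subject_id, self.predicate_id, self.object_id)


# Values copied from the slide.
example = Mapping(
    subject_id="EFO:10000070",
    subject_label="Adenofibroma",
    predicate_id="skos:exactMatch",
    object_id="MONDO:0006071",
    object_label="adenofibroma",
    mapping_justification="semapv:LexicalMatching",
    confidence=0.8,
)
```

The triple alone says what is mapped; the remaining fields say why and how confidently, which is the part a plain subject-predicate-object statement leaves out.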
Recap: What problem do we solve? Provide a simple, spreadsheet-based format to facilitate widespread adoption
#mapping_set_id: https://w3id.org/sssom/commons/mouse-human/mappings/mp_hp_mgi_all.sssom.tsv
#mapping_set_title: All mappings of MP terms to HPO terms generated by MGI
#mapping_set_description: "Consolidated list of all HPO to MP mappings done by MGI…."
#creator_id:
# - orcid:0000-0003-4606-0597
# - ror:021sy4w91
#license: https://creativecommons.org/licenses/by/4.0/
#object_source: obo:hp
#subject_source: obo:mp
#curie_map:
# HP: http://purl.obolibrary.org/obo/HP_
# MP: http://purl.obolibrary.org/obo/MP_
Mapping Table
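A minimal sketch of how such a file could be read with the Python standard library alone: the `#`-prefixed lines are collected as the metadata block and the rest is parsed as a tab-separated table. The function name `split_sssom_tsv` is hypothetical, and the single table row is a made-up placeholder, not a real mapping. In practice the metadata text would be handed to a YAML parser, or the whole file to a dedicated library such as sssom-py.

```python
import csv
import io


def split_sssom_tsv(text):
    """Split SSSOM/TSV text into (metadata_yaml_text, table_rows)."""
    header_lines = []
    table_lines = []
    for line in text.splitlines():
        if line.startswith("#") and not table_lines:
            # Drop only the leading "#"; keep indentation so nested YAML survives.
            header_lines.append(line[1:])
        elif line.strip():
            table_lines.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(table_lines)), delimiter="\t")
    return "\n".join(header_lines), list(reader)


sample = (
    "#mapping_set_title: All mappings of MP terms to HPO terms generated by MGI\n"
    "#curie_map:\n"
    "# HP: http://purl.obolibrary.org/obo/HP_\n"
    "# MP: http://purl.obolibrary.org/obo/MP_\n"
    "subject_id\tpredicate_id\tobject_id\tmapping_justification\n"
    # Placeholder row for illustration only.
    "MP:9999999\tskos:exactMatch\tHP:9999999\tsemapv:ManualMappingCuration\n"
)

metadata_text, rows = split_sssom_tsv(sample)
```

Keeping the metadata in comment lines means the same file still opens as an ordinary table in a spreadsheet tool.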
Recap: What problem do we solve? Document rich mapping justifications to facilitate well-informed re-use decisions across use cases
:A skos:exactMatch :B
mapping_justification: Lexical matching
subject_match_field: rdfs:label
object_match_field: skos:prefLabel
match_string: “C”
mapping_justification: Manual mapping curation
author_id: orcid:123
confidence: 0.7
Other examples of justifications:
subject_preprocessing: semapv:CaseNormalization
mapping_tool: lexmatch
https://mapping-commons.github.io/semantic-mapping-vocabulary/
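To make the lexical-matching justification concrete, here is a rough sketch of a matcher that records how each match was found. The function name `lexical_matches` and the input dictionaries are hypothetical. `object_preprocessing` is not on the slide and is included on the assumption that it mirrors `subject_preprocessing`. This is not how any particular mapping tool works.

```python
def lexical_matches(subjects, objects):
    """Return mapping dicts for labels that agree after case normalization.

    subjects: dict of subject_id -> rdfs:label
    objects:  dict of object_id  -> skos:prefLabel
    """
    # Index object labels by their case-normalized form.
    index = {}
    for object_id, label in objects.items():
        index.setdefault(label.casefold(), []).append(object_id)

    mappings = []
    for subject_id, label in subjects.items():
        key = label.casefold()
        for object_id in index.get(key, []):
            mappings.append({
                "subject_id": subject_id,
                "predicate_id": "skos:exactMatch",
                "object_id": object_id,
                "mapping_justification": "semapv:LexicalMatching",
                "subject_match_field": "rdfs:label",
                "object_match_field": "skos:prefLabel",
                "subject_preprocessing": "semapv:CaseNormalization",
                # Assumption: recorded symmetrically with the subject side.
                "object_preprocessing": "semapv:CaseNormalization",
                "match_string": key,
            })
    return mappings


# Hypothetical labels for the :A / :B example.
found = lexical_matches({":A": "C"}, {":B": "c", ":D": "other"})
```

Because every row states which fields were compared and how they were preprocessed, a downstream user can decide whether a label-only match is good enough for their use case.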
Recap: What problem do we solve? A well defined data model for mappings and justifications
Rich YAML schema powered by
ShEx shapes for validating RDF
JSON Schema
Markdown docs
https://w3id.org/sssom/spec
Recap: What problem do we solve? Promoting the creation of interoperable FAIR mapping registries
m1.sssom.tsv
m2.sssom.tsv
m3.sssom.tsv
b.sssom.tsv
Registry
Shared QC, automatic reconciliation
Wrong mapping!
Collaborative curation
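One way shared QC across a registry might look, as a sketch only: mapping sets are merged, duplicate triples are dropped, and subjects with more than one exact match in the same target namespace are flagged for curators. The function names, the `source_file` key and the conflict rule are all assumptions made for illustration; the slide does not specify which checks a registry runs.

```python
from collections import defaultdict


def merge_mapping_sets(mapping_sets):
    """Combine rows from several sets, dropping repeated triples."""
    seen = set()
    merged = []
    for source, rows in mapping_sets.items():
        for row in rows:
            triple = (row["subject_id"], row["predicate_id"], row["object_id"])
            if triple not in seen:
                seen.add(triple)
                # "source_file" is a hypothetical bookkeeping key.
                merged.append({**row, "source_file": source})
    return merged


def conflicting_exact_matches(rows):
    """Find subjects with several skos:exactMatch objects sharing a prefix."""
    targets = defaultdict(set)
    for row in rows:
        if row["predicate_id"] == "skos:exactMatch":
            prefix = row["object_id"].split(":", 1)[0]
            targets[(row["subject_id"], prefix)].add(row["object_id"])
    return {key: sorted(objs) for key, objs in targets.items() if len(objs) > 1}


def _row(subject_id, object_id):
    return {"subject_id": subject_id, "predicate_id": "skos:exactMatch",
            "object_id": object_id}


# Made-up identifiers; file names follow the slide.
registry = {
    "m1.sssom.tsv": [_row("A:1", "B:1")],
    "m2.sssom.tsv": [_row("A:1", "B:2"), _row("A:2", "B:3")],
    "m3.sssom.tsv": [_row("A:1", "B:1")],
}
merged = merge_mapping_sets(registry)
conflicts = conflicting_exact_matches(merged)
```

A flagged conflict does not say which mapping is wrong, only that the sets disagree and someone should look.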
Updates 2023: SSSOM Model and Documentation
Curation rules: Capture a (potentially) complex (set of) condition(s) executed by an agent (usually human) that led to the establishment of a mapping.
:A skos:exactMatch :B
mapping_justification: Manual mapping curation
author_id: orcid:123
confidence: 0.7
curation_rule: DISEASE_MAPPING_COMMONS_RULES:MPR3
“Two diseases are considered exact matches if they share both phenotypic presentation and genetic underpinnings.”
Updates 2023: Tooling
Open Mapping Justification widget
Open Mapping page
Open Mappings page
https://github.com/EBISPOT/oxo2
Updates 2023: Tooling (OxO 2)
Updates 2023: SSSOM @ OAEI
Updates 2023: User Radar (selected)
Discussion 2023: What about other types of entity mappings?
Type 1: lexical token - identifier (adenofibroma, MONDO:0006071)
Type 2: identifier - identifier (EFO:1000070, MONDO:0006071)
Type 3: complex (“Hypertensive heart disease without congestive heart failure”: “Hypertensive heart disease” AND Not “Congestive heart failure”, where Not modifies “Congestive heart failure”)
SSSOM 2023 Workshop: The Limits of SSSOM.
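The complex case can be illustrated with a tiny expression tree, which shows why it does not fit a single subject-predicate-object row: the target is a combination of terms, not one identifier. The classes `Term`, `Not` and `And` are hypothetical and not part of SSSOM. Treating the expression as a test over a set of recorded findings is an assumption made only for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    label: str

    def matches(self, findings):
        # True when the labelled condition is among the findings.
        return self.label in findings


@dataclass(frozen=True)
class Not:
    operand: object

    def matches(self, findings):
        return not self.operand.matches(findings)


@dataclass(frozen=True)
class And:
    left: object
    right: object

    def matches(self, findings):
        return self.left.matches(findings) and self.right.matches(findings)


# "Hypertensive heart disease without congestive heart failure"
complex_target = And(
    Term("Hypertensive heart disease"),
    Not(Term("Congestive heart failure")),
)
```

A Type 2 mapping would need only one `Term` on each side; the nesting is what pushes this example beyond a flat table row.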
Acknowledgements (SSSOM Work)
Funding
Phenomics First (NIH / NHGRI #1RM1HG010860-01): Spec, Mondo integration, sssom-py CLI
Monarch (NIH / OD #5R24OD011883): Cross-species mappings, outreach, knowledge graph integration
Bosch Gift to LBNL: sssom-py IO, testing, converters, tutorials
DARPA: Young Faculty Award W911NF2010255 (PI: Benjamin M. Gyori)
Community contributions: https://w3id.org/sssom
Core Team, alphabetical order (https://github.com/orgs/mapping-commons/teams/sssom-core)
Database (Oxford), Volume 2022, baac035, https://doi.org/10.1093/database/baac035
*recent joiners highlighted in bold