A Simple Standard for Sharing Ontology Mappings (SSSOM)
Matentzoglu, Nicolas; Balhoff, James P.; Callahan, Tiffany; Chute, Christopher; Dahzi, Jiao; Duncan, William; Gabriel, Davera; Graybeal, John; Haendel, Melissa; Harmse, Henriette; Harold, Solbrig; Harris, Nomi; Hegde, Harshad; Hoyt, Charles Tapley; Jimenez-Ruiz, Ernesto; Jupp, Simon; Kim, Hyeongsik; Koehler, Sebastian; Liener, Thomas; Malone, James; McLaughlin, James; Munoz-Torres, Monica; Osumi-Sutherland, David; Overton, James; Thessen, Anne; Vasilevsky, Nicole; Mungall, Chris
WSBO - 14.07.2021
https://fairsharing.org/bsg-s001618
A plea for caring more about the standardisation and dissemination
of ontology mappings.
What is a mapping (in the sense of this talk)?
MONDO:0006071
“adenofibroma”
MONDO:0006071
EFO:1000070
UBERON:0000483
"epithelium (drosophila)" FBbt:00007005
NCBITaxon:7227
Type 1: string - term
Type 2: term - term
Type 3: complex
Convergence vs Mapping
How are mappings curated and used in practice?
Mappings are essential to bridge semantic spaces, but they are hard to use and to share.
Non-transparent imprecision: Mappings are rarely exact equivalents, but we often don't know that
O2:Alzheimer’s
O1:Alzheimer’s 2
O3:Alzheimer’s 3
(broad)
(narrow)
Inaccurate and incomplete: The consequence of relying on tools or mass manual review
An overview of mapping systems in the biomedical domain (excerpt)
Mappings are pretty unFAIR
prov:stolen_from
_:b
Mappings could be:
Making mappings FAIR and Open: A Simple Standard for Sharing Ontology Mappings (SSSOM)
mapping / correspondence
mapping set
mapping server
mapping commons
derives
makes available & trusts
makes available & trusts
maintains
part of
derived mapping set
partial mapping set
alignment
Mapping model
subject_id: EFO:1000070
subject_label: Adenofibroma
predicate_id: skos:exactMatch
object_id: MONDO:0006071
object_label: adenofibroma
match_type: HumanCurated
confidence: 0.9
mapping_date: 29.05.2021
orcid:0001-.....
subject_match_field: label
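The mapping shown above can be written down as a small typed record. The following is a minimal sketch, not the SSSOM reference implementation: the `Mapping` class is a hypothetical name, the choice of which fields are required is an assumption, and the range check on `confidence` assumes a score between 0 and 1. The elided ORCID from the slide is left out rather than guessed.

```python
from dataclasses import dataclass, asdict
from typing import Optional


@dataclass(frozen=True)
class Mapping:
    """One mapping (correspondence), using field names from the slide."""

    subject_id: str
    predicate_id: str
    object_id: str
    match_type: str
    subject_label: Optional[str] = None
    object_label: Optional[str] = None
    confidence: Optional[float] = None
    mapping_date: Optional[str] = None
    subject_match_field: Optional[str] = None

    def __post_init__(self):
        # Assumption: confidence is a score in the closed interval [0, 1].
        if self.confidence is not None and not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0 and 1")


example = Mapping(
    subject_id="EFO:1000070",
    predicate_id="skos:exactMatch",
    object_id="MONDO:0006071",
    match_type="HumanCurated",
    subject_label="Adenofibroma",
    object_label="adenofibroma",
    confidence=0.9,
    mapping_date="29.05.2021",
    subject_match_field="label",
)

# Keep only the populated fields, as one would when writing a table row.
row = {key: value for key, value in asdict(example).items() if value is not None}
```

Keeping the provenance fields (`match_type`, `confidence`, `mapping_date`) next to the subject, predicate and object is what lets a consumer judge how far to trust each individual mapping.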
The SSSOM metadata model
Rich YAML schema powered by
ShEx shapes for validating RDF
JSON Schema
Markdown docs
- subject_id
- subject_label
- subject_category
- predicate_id
- predicate_label
- object_id
- object_label
- object_category
- match_type
- creator_id
- creator_label
- license
- subject_source
- subject_source_version
- object_source
- object_source_version
- mapping_provider
- mapping_cardinality
- mapping_tool
- mapping_date
- confidence
- subject_match_field
- object_match_field
- match_string
- subject_preprocessing
- object_preprocessing
- match_term_type
- semantic_similarity_score
- see_also
- other
- comment
https://github.com/mapping-commons/sssom-py/tree/master/schema
Example SSSOM TSV file
Metadata header with CURIE map and various mapping-set-level metadata fields.
Actual mappings with metadata in TSV form
import pandas as pd
df = pd.read_csv(f, comment="#", sep="\t")
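The two-part layout described above (a commented metadata header followed by tab-separated mappings) can also be read with the standard library alone, which keeps the header instead of discarding it. This is a minimal sketch: `read_sssom` is a hypothetical helper, not part of sssom-py, and the sample text is illustrative, reusing the mapping from the talk. The header lines are returned as plain text; parsing them as YAML is left to the caller.

```python
import csv

# Illustrative file content (an assumption, not an official example file).
SSSOM_TEXT = (
    "# curie_map:\n"
    "#   EFO: http://www.ebi.ac.uk/efo/EFO_\n"
    "#   MONDO: http://purl.obolibrary.org/obo/MONDO_\n"
    "subject_id\tpredicate_id\tobject_id\tmatch_type\tconfidence\n"
    "EFO:1000070\tskos:exactMatch\tMONDO:0006071\tHumanCurated\t0.9\n"
)


def read_sssom(text):
    """Split SSSOM-style TSV text into (header_lines, mapping_rows)."""
    header, body = [], []
    for line in text.splitlines():
        if line.startswith("#"):
            # Drop the "#" and one following space, keeping any indentation
            # so the header can later be handed to a YAML parser.
            header.append(line[2:] if line.startswith("# ") else line[1:])
        elif line.strip():
            body.append(line)
    # The first non-comment line holds the column names.
    rows = list(csv.DictReader(body, delimiter="\t"))
    return header, rows


header, rows = read_sssom(SSSOM_TEXT)
```

Here `header` holds the metadata lines and `rows` is a list of dictionaries keyed by column name, with every value still a string.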
sssom-py: a Python toolkit and CLI to process SSSOM files
Formats: obographs-json, alignment-api-xml, sssom-tsv
Operations: import, export, merge, filter
https://github.com/mapping-commons/sssom-py
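To illustrate what merging and filtering mapping sets means, here is a plain-Python sketch. It deliberately does not use or imitate the sssom-py API: `merge` and `filter_mappings` are hypothetical helpers, the `EX:` identifiers are made up, and treating a missing confidence as 1.0 is an assumption made only for this example.

```python
def merge(*mapping_sets):
    """Concatenate mapping sets, keeping the first copy of each
    (subject_id, predicate_id, object_id) triple."""
    seen, merged = set(), []
    for mapping_set in mapping_sets:
        for mapping in mapping_set:
            key = (mapping["subject_id"], mapping["predicate_id"], mapping["object_id"])
            if key not in seen:
                seen.add(key)
                merged.append(mapping)
    return merged


def filter_mappings(mappings, predicate_id=None, min_confidence=0.0):
    """Keep mappings with the given predicate and at least min_confidence."""
    return [
        m for m in mappings
        if (predicate_id is None or m["predicate_id"] == predicate_id)
        # Assumption: a mapping without a confidence value counts as 1.0.
        and float(m.get("confidence", 1.0)) >= min_confidence
    ]


set_a = [
    {"subject_id": "EFO:1000070", "predicate_id": "skos:exactMatch",
     "object_id": "MONDO:0006071", "confidence": 0.9},
]
set_b = [
    # Same triple as in set_a: dropped as a duplicate on merge.
    {"subject_id": "EFO:1000070", "predicate_id": "skos:exactMatch",
     "object_id": "MONDO:0006071", "confidence": 0.9},
    # Hypothetical identifiers, for illustration only.
    {"subject_id": "EX:0000001", "predicate_id": "skos:broadMatch",
     "object_id": "EX:0000002", "confidence": 0.5},
]

merged = merge(set_a, set_b)
exact = filter_mappings(merged, predicate_id="skos:exactMatch", min_confidence=0.8)
```

Because every mapping carries its own predicate and confidence, a consumer can combine sets from several providers and then select only the precision level a given use case requires.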
Example uses
A simple guide to making your mappings FAIR and open
Takeaways
Join us for the next SSSOM user meetup on 3rd September.
Acknowledgements:
Phenomics First (NIH / NHGRI #1RM1HG010860-01): Spec, Mondo integration, sssom-py CLI
Bosch Gift to LBNL: sssom-py IO, testing, converters
Mapping community: Many volunteer contributions, e.g. Charlie Hoyt (sssom-py), Pistoia Alliance (Thomas Liener), John Graybeal (spec), James McLaughlin (infrastructure), ...
Core team: https://w3id.org/sssom/SSSOM.md