Advanced OBO Ontology Toolbox
Nicolas Matentzoglu @ ICBO 2023 Ontology Training
Managing the ontology life cycle
The goal of ontology engineering
Building a knowledge base or classification that captures all relevant concepts of a domain or use case, and their relations to other domains, in a scientifically accurate and logically consistent way.
Logical axioms in ontologies relate entities from diverse ontologies
[Figure: Hypolysinemia = decreased amount (Quality, PATO) of lysine (Entity, ChEBI) that is part of blood]
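The Hypolysinemia example can be sketched as a small data structure that combines a quality, an entity and a location. This is a minimal illustration, not the axiom actually used in any ontology: the class `EQDefinition`, its field names, the 'characteristic of' relation and the rendered expression syntax are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EQDefinition:
    """Hypothetical container for a quality/entity style logical definition."""

    quality: str   # label of a quality term, e.g. from PATO
    entity: str    # label of an entity term, e.g. from ChEBI
    location: str  # label of the structure the entity is part of

    def render(self) -> str:
        # Simplified, Manchester-like rendering for illustration only.
        # 'characteristic of' is an assumed relation; only 'part of'
        # appears on the slide.
        return (
            f"'{self.quality}' and 'characteristic of' some "
            f"('{self.entity}' and 'part of' some '{self.location}')"
        )


hypolysinemia = EQDefinition(
    quality="decreased amount", entity="lysine", location="blood"
)
print(hypolysinemia.render())
```

Each of the three fields points at a term that is maintained in a different ontology, which is why the definition depends on external sources.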
How do I get terms from external ontologies to re-use them?
How do I make sure that whenever I make a change, I didn’t break anything?
[Figure: Developer]
How do I make sure all users can access the ontology in a standardised, FAIR manner?
[Figure: hpo.owl, User]
The ODK is a toolbox and ontology life-cycle management system
Toolbox (ODK image): OAK, make, bash, ROBOT, dosdp-tools, reasoners, fastobo
Workflow system
docker pull obolibrary/odkfull
Workflow: Dependency management
[Figure: hp-edit.owl imports the term blood through uberon_import.owl]
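One way to picture dependency management is to collect every external term the editors file (hp-edit.owl) refers to and group it by source ontology, so that each import module (such as uberon_import.owl) only needs the terms that are actually used. The sketch below assumes CURIE-style identifiers; the helper `external_terms_by_prefix` and the identifiers are hypothetical, and the ODK's own import workflow is not reproduced here.

```python
from collections import defaultdict


def external_terms_by_prefix(curies, own_prefix):
    """Group referenced CURIEs by ontology prefix, skipping the ontology's own terms."""
    grouped = defaultdict(set)
    for curie in curies:
        prefix, _, local_id = curie.partition(":")
        if not local_id:
            raise ValueError(f"not a CURIE: {curie!r}")
        if prefix != own_prefix:
            grouped[prefix].add(curie)
    # Sort prefixes and terms so the resulting seed lists are stable.
    return {prefix: sorted(terms) for prefix, terms in sorted(grouped.items())}


# Placeholder identifiers for illustration, not verified term IDs.
referenced = [
    "HP:0000001",
    "UBERON:0000001",
    "PATO:0000001",
    "UBERON:0000001",
    "CHEBI:0000001",
]
seeds = external_terms_by_prefix(referenced, own_prefix="HP")
for prefix, terms in seeds.items():
    print(f"{prefix.lower()}_import.owl <- {terms}")
```

Keeping one seed list per source ontology mirrors the one-import-file-per-ontology layout shown on the slide.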
Continuous Integration Testing
Developer edits locally
Developer makes a pull request
CI system (GitHub Actions) runs the ODK checks
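The kind of check a CI run performs on a pull request can be illustrated with a toy quality-control function. This is a sketch under assumptions: the term records, the two rules (missing definition, duplicate label) and the function name `qc_report` are hypothetical and do not reproduce the actual ODK checks.

```python
from collections import Counter


def qc_report(terms):
    """Return a sorted list of human-readable problems found in term records."""
    problems = []
    label_counts = Counter(term["label"] for term in terms)
    for term in terms:
        if not term.get("definition"):
            problems.append(f"{term['id']}: missing definition")
        if label_counts[term["label"]] > 1:
            problems.append(f"{term['id']}: duplicate label {term['label']!r}")
    return sorted(problems)


# Toy records with made-up identifiers.
terms = [
    {"id": "EX:1", "label": "blood", "definition": "A fluid."},
    {"id": "EX:2", "label": "blood", "definition": ""},
    {"id": "EX:3", "label": "lysine", "definition": "An amino acid."},
]
report = qc_report(terms)
for line in report:
    print(line)

# A CI job would typically signal failure when the report is non-empty.
exit_code = 1 if report else 0
```

Running such checks on every pull request means a problem is reported before the change is merged rather than after a release.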
Workflow: Release pipeline
hp-edit.owl → Release
Variants: Full, Base, Subsets
Serialisations: hp-full.owl, hp-full.obo, hp-full.json
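The release pipeline combines variants with serialisations. The sketch below enumerates the resulting file names, assuming the `<id>-<variant>.<format>` pattern seen in hp-full.owl, hp-full.obo and hp-full.json; the function `release_files` is hypothetical, the base file names are extrapolated from that pattern, and subsets are left out because the slide does not show how they are named.

```python
from itertools import product


def release_files(ontology_id, variants, formats):
    """List release file names following the <id>-<variant>.<format> pattern."""
    return [
        f"{ontology_id}-{variant}.{fmt}"
        for variant, fmt in product(variants, formats)
    ]


files = release_files("hp", variants=["full", "base"], formats=["owl", "obo", "json"])
for name in files:
    print(name)
```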
Overview
Generate standard git repository: editors file, release files, imports
Social workflows:
CI/CD:
Executable workflows:
Acknowledgements
Core team
Funding:
Office of the Director, National Institutes of Health (R24-OD011883); National Human Genome Research Institute, ‘Phenomics First’ (RM1HG010860); National Institute of Mental Health (1RF1MH123220-01); National Heart, Lung, and Blood Institute (5U01HG009453-03); UK Biotechnology and Biological Sciences Research Council/US National Science Foundation Directorate for Biological Sciences (BBSRC-NSF/BIO BB/T014008/1); The Wellcome Trust, ‘Virtual Fly Brain’ (105023MA); Director, Office of Science, Office of Basic Energy Sciences, of the US Department of Energy (DE-AC0205CH11231 to C.J.M.); European Molecular Biology Laboratory - European Bioinformatics Institute core funds
ROBOT
The Swiss Army Knife of the Ontology Engineer
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-019-3002-3
Ontology Access Kit (OAK)
ICBO Ontology Training 2023
OAK: A Python library for ontology access
Command line interface: for everyone!
Modular Python packages: for developers and data scientists
50 multi-option commands
What can you do with OAK?
Basic Ontology Lookup
Graph-oriented Operations
Text and NLP
Ontology Associations
Plugins for:
OWL-oriented Operations*
Validation
Change
Semantic similarity
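As an illustration of the semantic similarity item, one common approach is to compare the ancestor sets of two terms with the Jaccard index. The sketch below is a generic implementation on made-up ancestor sets; the function name `jaccard` and the term names are assumptions, and it does not show how OAK itself computes similarity.

```python
def jaccard(a, b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Made-up ancestor sets for two hypothetical terms.
ancestors_x = {"A", "B", "C"}
ancestors_y = {"B", "C", "D"}
score = jaccard(ancestors_x, ancestors_y)
print(score)
```

Two terms that share most of their ancestors score close to 1, while terms from unrelated branches score close to 0.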
Ontology Subgraph Visualization
Customizable via JSON stylesheet
runoak -i cl.db viz -p i,p 'memory T-cell'
Compact representation of OWL TBox axioms (e.g. existential restrictions)
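The idea behind restricting a subgraph to selected predicates (as with `-p i,p`, which selects is-a and part-of) can be sketched as a graph walk that only follows the allowed edge types. This is a pure-Python illustration on a toy edge list, not OAK's implementation or API; the function `ancestors` and all edges are assumptions made for this example.

```python
from collections import deque


def ancestors(edges, start, predicates):
    """Breadth-first walk upward, following only edges with an allowed predicate."""
    seen = set()
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for subject, predicate, obj in edges:
            if subject == node and predicate in predicates and obj not in seen:
                seen.add(obj)
                queue.append(obj)
    return seen


# Toy edges for illustration only; they were not extracted from CL.
edges = [
    ("memory T cell", "is_a", "T cell"),
    ("T cell", "is_a", "lymphocyte"),
    ("T cell", "part_of", "immune system"),
    ("T cell", "develops_from", "thymocyte"),
]
subgraph_nodes = ancestors(edges, "memory T cell", {"is_a", "part_of"})
print(sorted(subgraph_nodes))
```

Dropping `part_of` from the predicate set shrinks the result to the is-a hierarchy only, which is the kind of control the predicate option gives over a visualization.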
On becoming an Open Science Engineer
ICBO 2023 - OBO Tutorial - Nicolas Matentzoglu
Open Science and Open Data are catalysts for tackling global problems
Non-standardised data without explicit semantics is not really “open”, even if it is publicly available.
Ontologies play a central role in Open Science
Which are the top five technologies that will facilitate global open data integration in the next 5 to 10 years? (Answer by ChatGPT, with GPT-4 model)
To fulfill all of these roles, ontologies must be community-driven
[Figures for 2009, 2016 and 2022*]
* incomplete list of contributors
Coordinating across sources is very difficult
I need these terms.
These terms do not make sense.
Is the nipple part of the mammary gland?
I would be hesitant to classify interstitium as an organ.
The Open Science Engineer
The Open Science Engineer contributes to the collection and standardisation of publicly available scientific knowledge through curation, community coordination, and data, ontology and software engineering.
3 Basic Practices for Community Coordination
Practice of Collaboration
Open Science Projects are heavily interlinked
[Figure: RO, OMO, PHENIO, EFO, GWAS Catalog]
Practice of Upstream Fixing
Practice of No-Ownership
Tools that all aspiring Open Science Engineers with a focus on semantics should know
https://oboacademy.github.io/obook/reference/semantic-engineering-toolbox/
Prompt Engineering?
First steps to becoming an effective Open Science Engineer in Biomedical Ontologies
Today was a good start…
Join the community to tackle the global challenges of our time!
Thank you!