OntoChimp: A ChatGPT-Based Concept Extraction Resource for Ontology Development
Samuel Smith
University of Michigan Medical School, Ann Arbor, MI, USA
Motivation and Challenge
• Developing a domain-specific ontology requires identifying a comprehensive, well-defined list of key concepts.
• Manual extraction is labor-intensive and inconsistent.
• OntoChimp provides a semi-automated, NLP-driven approach combining ChatGPT and ontology analysis tools.
Overview of the OntoChimp Process
Workflow 1 extracts key concepts from a document (*) and structures the data plus metadata.
Output: the OntoChimp TermTable, a structured dictionary for each key concept.
Workflow 2 utilizes BioPortal Annotator and other resources to categorize the key concepts.
(* term clusters are also extracted but not yet processed.)
Workflow 1: Key Concept Extraction
Inputs: 29 chapters of 'The Handbook of Solitude' (2021)
Process:
Output: 506 solitude-related key concepts.
Example: TermTable JSON Dictionary Output
The TermTable forms a bridge between Workflow 1 and Workflow 2.
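To make the idea of "a structured dictionary for each key concept" concrete, here is a minimal sketch of what a TermTable record could look like. The field names (`key_concept`, `metadata`, `source_document`, `chapter`, `categorization`) and the chapter values are hypothetical placeholders chosen for illustration; they are not taken from the OntoChimp code.

```python
import json


def make_termtable_entry(key_concept, source_document, chapter):
    """Build one TermTable record. All field names are hypothetical."""
    return {
        "key_concept": key_concept,
        "metadata": {
            "source_document": source_document,
            "chapter": chapter,  # placeholder value, not real provenance
        },
        # Left empty by Workflow 1; Workflow 2 would fill this in.
        "categorization": None,
    }


# One record per key concept, keyed by the concept string.
term_table = {}
for concept, chapter in [("solitude", 1), ("loneliness", 1)]:
    term_table[concept] = make_termtable_entry(
        concept, "The Handbook of Solitude", chapter
    )

# Serialize to JSON so Workflow 2 can read it as a plain file.
print(json.dumps(term_table, indent=2))
```

Keying the table by concept string keeps lookups simple when Workflow 2 attaches categorization results to each record.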
Workflow 2: Concept Categorization
Example: Categorization Table
Unexpected finding: a concept missing from an existing ontology.
OntoChimp, via BioPortal Annotator, identified that “loneliness” was not in the Emotion Ontology (MFOEM).
We thank the MFOEM maintainers for adding “loneliness” and “feeling lonely” to MFOEM shortly after notification.
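A check like the “loneliness” one can be sketched against the BioPortal Annotator REST endpoint. The sketch below only builds the request URL and inspects an already-parsed JSON response; the helper names are hypothetical, and the response shape (`annotatedClass` → `links` → `ontology`) is an assumption based on the public BioPortal API that should be verified against its documentation. It is not OntoChimp's actual implementation.

```python
from urllib.parse import urlencode

ANNOTATOR_URL = "https://data.bioontology.org/annotator"


def build_annotator_url(text, ontologies, api_key):
    """Build an Annotator request URL restricted to the given ontologies."""
    params = {
        "text": text,
        "ontologies": ",".join(ontologies),
        "apikey": api_key,  # a personal BioPortal API key is required
    }
    return f"{ANNOTATOR_URL}?{urlencode(params)}"


def matched_ontologies(annotations):
    """Collect ontology acronyms from a parsed Annotator response (a list)."""
    found = set()
    for item in annotations:
        link = item.get("annotatedClass", {}).get("links", {}).get("ontology", "")
        if link:
            # Assumed link form: .../ontologies/<ACRONYM>
            found.add(link.rstrip("/").rsplit("/", 1)[-1])
    return found


def is_missing(annotations, ontology):
    """True if no annotation in the response comes from `ontology`."""
    return ontology not in matched_ontologies(annotations)


# Fetching the URL (e.g. with urllib.request.urlopen) and decoding the JSON
# body would produce the `annotations` list passed to is_missing().
url = build_annotator_url("loneliness", ["MFOEM"], "YOUR_API_KEY")
```

An empty result for a term that clearly belongs to an ontology's domain is what flags a potentially missing concept for manual follow-up.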
| Key Concept | Disposition Code | Matched Onto. | Onto. ID | Comment |
|---|---|---|---|---|
| solitude | into-exact | PHASES | | Term already planned for the Target Ontology (TO) |
| intervention | inoo-incl | BCIO | BCIO_037000 | inoo in BCIO and selected as candidate for TO |
| introversion | inoo-incl | MFOEM | MF_0000026 | inoo in MFOEM and selected |
| maternal behavior | inoo-incl | MFOEM | GO_0042711 | inoo in MFOEM but ultimately from the Gene Ontology (GO) |
| public policy | inoo-excl | BCIO | ADDICTO_0001109 | inoo but too general for use in TO |
| problem solve | generic | | | valid for domain but too generic for onto. |
| carl jung | scope | | (example) | valid for domain but out of scope of onto.; a person |
| (none) | nval | | | invalid for domain (so far, no KC is invalid!) |
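The disposition codes in the table can be treated as a small controlled vocabulary. The following sketch paraphrases each code's meaning from the table comments and selects the key concepts that would go forward to the Target Ontology; treating `into-exact` and `inoo-incl` as the "included" codes is an assumption of this sketch, and the names `DISPOSITIONS`, `INCLUDED`, and `candidates` are hypothetical.

```python
from collections import Counter

# Code meanings paraphrased from the Comment column of the table.
DISPOSITIONS = {
    "into-exact": "term already planned for the Target Ontology (TO)",
    "inoo-incl": "found in another ontology and selected as candidate for TO",
    "inoo-excl": "found in another ontology but excluded (e.g. too general)",
    "generic": "valid for domain but too generic for the ontology",
    "scope": "valid for domain but out of scope of the ontology",
    "nval": "invalid for domain",
}

# Assumption: these two codes mean the concept goes into the TO.
INCLUDED = {"into-exact", "inoo-incl"}

# (key concept, disposition code) pairs copied from the table rows.
rows = [
    ("solitude", "into-exact"),
    ("intervention", "inoo-incl"),
    ("introversion", "inoo-incl"),
    ("maternal behavior", "inoo-incl"),
    ("public policy", "inoo-excl"),
    ("problem solve", "generic"),
    ("carl jung", "scope"),
]


def candidates(rows):
    """Return key concepts whose code marks them for the TO."""
    unknown = sorted({code for _, code in rows if code not in DISPOSITIONS})
    if unknown:
        raise ValueError(f"unknown disposition codes: {unknown}")
    return [concept for concept, code in rows if code in INCLUDED]


counts = Counter(code for _, code in rows)
selected = candidates(rows)
```

Validating codes against a fixed dictionary catches typos early, which matters when several hundred key concepts are categorized by hand.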
Observations & Early Results
Future Work
OntoChimp code is in development at: github.com/smithGit/OntoChimp/
Summary
Acknowledgements