NCSA faculty fellowship w/iSchool on turning free-text into Knowledge-Graph triples
Mike Bobak
NCSA faculty fellowship w/iSchool 2021-2022
I worked on:
incl. some in brat format to more easily view the parse/relationships within the sentences
or
Move services either to REST based calls, or to local execution.
Motivation: machine interpretability of knowledge from free-text
Things-not-strings via: free-text -to-> Knowledge-Graph triples (entities w/relationships)
helps achieve the goal of machine-interpretability [KGs need connected things]
blog.google/products/search/introducing-knowledge-graph-things-not
Introducing the Knowledge Graph:
things, not strings
1. Find the right thing: Language can be ambiguous
2. Get the best summary: With the Knowledge Graph, Google can better understand your query
3. Go deeper and broader
Finally, the part that’s the most fun of all—the Knowledge Graph can help you make some unexpected discoveries.
There are several application areas for
machine interpretable knowledge
e.g.
Named-Entity-Recognition & Linking
wikipedia.org/wiki/Capital_city_of
Knowledge-Graph triples are made of URI/things,
w/some literal objects
wikipedia.org/wiki/France
wikipedia.org/wiki/Capital_city
wikipedia.org/wiki/Paris
Literals are e.g. text, numbers, or any XML Schema (xsd) datatype, but they can only appear in the Object position
dbp:Paris dbp:Population "2161000"^^xsd:int
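A minimal sketch of the triple structure described above, in plain Python rather than any particular RDF library. The class names `URI`, `Literal`, `Triple` and the helper `render` are hypothetical, introduced only for illustration; blank nodes are ignored. It shows the one rule stated on this slide: subjects and predicates are things (URIs), and a literal may only sit in the object position.

```python
from dataclasses import dataclass
from typing import Union


@dataclass(frozen=True)
class URI:
    value: str  # e.g. "dbp:Paris"


@dataclass(frozen=True)
class Literal:
    value: str     # lexical form, e.g. "2161000"
    datatype: str  # e.g. "xsd:int"


@dataclass(frozen=True)
class Triple:
    subject: URI
    predicate: URI
    obj: Union[URI, Literal]

    def __post_init__(self):
        # Literals are only allowed as the object of a triple.
        if not isinstance(self.subject, URI) or not isinstance(self.predicate, URI):
            raise TypeError("subject and predicate must be URIs")


def render(t: Triple) -> str:
    """Render a triple in a Turtle-like form (the literal's lexical form is quoted)."""
    if isinstance(t.obj, URI):
        obj = t.obj.value
    else:
        obj = f'"{t.obj.value}"^^{t.obj.datatype}'
    return f"{t.subject.value} {t.predicate.value} {obj} ."


# thing -> thing
capital = Triple(
    URI("wikipedia.org/wiki/France"),
    URI("wikipedia.org/wiki/Capital_city"),
    URI("wikipedia.org/wiki/Paris"),
)
# thing -> literal
population = Triple(URI("dbp:Paris"), URI("dbp:Population"), Literal("2161000", "xsd:int"))

print(render(capital))
print(render(population))
```

Only the thing-to-thing triple connects two nodes of the graph; the literal triple ends at a value, which is why KGs need connected things.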
We use MetaMap-Lite for Entity-Linking
How it works:
Example MML match:
"Papillary Thyroid Carcinoma is a Unique Clinical Entity"
"Papillary Thyroid Carcinoma is a Unique Clinical"
"Papillary Thyroid Carcinoma is a Unique"
"Papillary Thyroid Carcinoma is a"
"Papillary Thyroid Carcinoma is"
"Papillary Thyroid Carcinoma" --> match
"is a Unique Clinical Entity"
"is a Unique Clinical"
"is a Unique"
"is a"
"is"
"a Unique Clinical Entity"
"a Unique Clinical"
"a Unique"
"a"
"Unique Clinical Entity"
"Unique Clinical"
"Unique" --> match
"Clinical Entity"
"Clinical" --> match
"Entity" --> match
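The walk-through above reads like a greedy longest-match lookup: start at a token, try the longest remaining span, shrink from the right until something matches, then continue after the match (or one token later if nothing matched). The sketch below reproduces that behavior under stated assumptions; it is not MetaMap-Lite's actual code. The function name `longest_match_spans` and the tiny `lexicon` set are hypothetical stand-ins for the real term dictionary.

```python
def longest_match_spans(tokens, lexicon):
    """Return (start, end, text) for each greedy longest match against the lexicon."""
    matches = []
    i = 0
    while i < len(tokens):
        matched = False
        # Try the longest span starting at token i first, then shrink from the right.
        for j in range(len(tokens), i, -1):
            candidate = " ".join(tokens[i:j])
            if candidate.lower() in lexicon:
                matches.append((i, j, candidate))
                i = j  # resume right after the matched span
                matched = True
                break
        if not matched:
            i += 1  # nothing starts here; move on one token
    return matches


# Hypothetical lexicon (lower-cased terms), for illustration only.
lexicon = {"papillary thyroid carcinoma", "unique", "clinical", "entity"}
sentence = "Papillary Thyroid Carcinoma is a Unique Clinical Entity"

for start, end, text in longest_match_spans(sentence.split(), lexicon):
    print(start, end, text)
```

With this lexicon the matches are the same four spans marked in the example: "Papillary Thyroid Carcinoma", "Unique", "Clinical", and "Entity".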
Entity Linking output to the brat rapid annotation tool
Expanding Beyond BioMedical domain
Ontologies with the predicate hasExactSynonym
have literal objects whose text can be harvested
to make MML handle new domains.
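A rough sketch of the harvesting idea, assuming the ontology has already been parsed into plain (subject, predicate, object) string tuples. The function `harvest_synonyms`, the `ex:` concept IDs, and the sample terms are all hypothetical; how the resulting table would be loaded into MML is not shown here.

```python
from collections import defaultdict

HAS_EXACT_SYNONYM = "oboInOwl:hasExactSynonym"


def harvest_synonyms(triples):
    """Map each synonym text (lower-cased) to the concept URIs that declare it."""
    lexicon = defaultdict(set)
    for subject, predicate, obj in triples:
        # Keep only synonym triples; their object is the literal text.
        if predicate == HAS_EXACT_SYNONYM:
            lexicon[obj.strip().lower()].add(subject)
    return dict(lexicon)


# Hypothetical ontology fragment, for illustration only.
triples = [
    ("ex:Concept1", "rdfs:label", "Stream gauge"),
    ("ex:Concept1", HAS_EXACT_SYNONYM, "stream gage"),
    ("ex:Concept1", HAS_EXACT_SYNONYM, "River Gauge"),
    ("ex:Concept2", "rdfs:subClassOf", "ex:Concept1"),
]

for term, concepts in sorted(harvest_synonyms(triples).items()):
    print(term, sorted(concepts))
```

Each harvested term points back to a URI, so a text match in a new domain still resolves to a thing rather than a string.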
I plan to use it for GeoCODES, & can think of many others it could be used in
incl. some in brat to more easily view the parse/relationships within the sentences
https://isda.ncsa.illinois.edu/~mbobak/
for February-June:
After this: extra slides. This is just a very rough 1st draft.
Clowder is mentioned in the NIH grant proposal & I will annotate this EC free-text too
Clowder organization
Allows for
Clowder search results
& a result’s metadata (tab) tree listing
Future work:
Faster time to science
via metadata use
to get more
resources
Can take questions later: @Mike Bobak