Named entities and LOD
Around the Globe in Eight Technologies
VeDPH Summer School 2022
Tiziana Mancinelli
Authoritativeness and LOD
Named entities: what
Definition: A named entity is any real-world object that can be designated by a proper name. A named entity recognition (NER) model detects entities in a text and classifies them (e.g., as a person's name, a location name, an organization's name, or a date).
Typology: persons, locations, organizations, stars, ... (Tran, 2006; Ehrmann, 2008)
Recognition and classification tasks (Nadeau and Sekine, 2007; Friburger and Maurel, 2004)
<person xml:id="Adam"> …..<persName>Adam</persName>
</person>
diex fist <persName ref="#Adam">adam</persName> le premier pere ne fu ou ques de
Reference is a fundamental semiotic and hermeneutical concept
Annotating a named entity: a person
There are many vocabularies for annotating people, and they provide several ways of marking up names and nominal expressions
Named entities: TEI/XML
Named entities appear in most texts.
Why does the TEI have a module to describe them?
Because an entity (person, place, organisation) might be known by many names or might be referred to by some other description entirely.
teiHeader:
<person xml:id="Adam"> …..<persName>Adam</persName>
</person>
Text:
diex fist <persName ref="#Adam">adam</persName> le premier pere ne fu ou ques de
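The pointer mechanism above can be followed programmatically. The sketch below is only an illustration: it wraps the two fragments in a hypothetical minimal TEI document (the `profileDesc`/`particDesc`/`listPerson` wrapper and the `<p>` element are assumptions, and the file is not schema-valid because `fileDesc` is omitted), then resolves each `ref="#..."` in the text to the matching `<person>` entry. The function name `resolve_person_refs` is invented for this example.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
# ElementTree exposes xml:id under the predefined XML namespace.
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Hypothetical minimal document built around the slide's two fragments.
TEI_DOC = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <profileDesc><particDesc><listPerson>
      <person xml:id="Adam"><persName>Adam</persName></person>
    </listPerson></particDesc></profileDesc>
  </teiHeader>
  <text><body>
    <p>diex fist <persName ref="#Adam">adam</persName> le premier pere</p>
  </body></text>
</TEI>"""


def resolve_person_refs(tei_xml):
    """Return (surface form, xml:id, canonical name) for each tagged mention."""
    ns = {"tei": TEI_NS}
    root = ET.fromstring(tei_xml)
    # Index every <person> entry by its xml:id.
    people = {p.get(XML_ID): p for p in root.iterfind(".//tei:person", ns)}
    text = root.find("tei:text", ns)
    resolved = []
    for mention in text.iterfind(".//tei:persName", ns):
        ref = mention.get("ref", "")
        if not ref.startswith("#"):
            continue  # no local pointer on this mention
        person = people.get(ref[1:])
        if person is None:
            continue  # pointer without a matching <person>
        name = person.findtext("tei:persName", default="", namespaces=ns)
        resolved.append((mention.text, ref[1:], name))
    return resolved


print(resolve_person_refs(TEI_DOC))
```

The spelling in the text ("adam") and the canonical name in the header ("Adam") stay separate, which is the point of the module: one entity, many surface forms.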
people
places
Named entities: TEI/XML
<place xml:id="Armenia">
<placeName>Graunt Ermenie</placeName>
</place>
Places: gazetteer
A gazetteer is a geographical index or directory used in conjunction with a map or atlas. It typically contains information concerning the geographical makeup, social statistics and physical features of a country, region, or continent. The content of a gazetteer can include a subject's location, dimensions of peaks and waterways, population, gross domestic product and literacy rate. This information is generally divided into topics with entries listed in alphabetical order. (Wikipedia)
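For annotation work, the most useful part of a gazetteer is often the mapping from name variants to a single place entry. The toy sketch below assumes a hand-made dictionary: the variant "Graunt Ermenie" and the identifier "Armenia" come from the `<place>` example, while the second variant and the helper names `normalise` and `lookup_place` are invented for illustration.

```python
# Toy gazetteer: normalised name variant -> identifier of the place entry.
# "Graunt Ermenie" and the id "Armenia" are taken from the <place> example;
# the modern spelling is added here only as a second, illustrative variant.
GAZETTEER = {
    "graunt ermenie": "Armenia",
    "armenia": "Armenia",
}


def normalise(name):
    """Lower-case the name and collapse runs of whitespace."""
    return " ".join(name.lower().split())


def lookup_place(name):
    """Return the place identifier for a name variant, or None if unknown."""
    return GAZETTEER.get(normalise(name))


print(lookup_place("Graunt  Ermenie"))
print(lookup_place("Venice"))
```

A real gazetteer would also carry coordinates and other attributes per entry; they are left out here because the slides do not give any.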
Evolution of the Web
“Words”: Global Entities
The evolution of the digital paradigm
Document Centric
Data Centric
What are Linked (open) data?
Linked data is a way of publishing structured data based on open web technologies and standards such as HTTP, RDF (Resource Description Framework) and URI (Uniform Resource Identifier). If linked data links open data, it is called linked open data (LOD).
Let’s link the named entities to online resources
<xenoData>
<rdf:RDF>
<rdf:Description tei:ref="#Adam" rdf:about="http://dbpedia.org/resource/Adam">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<rdfs:label xml:lang="en">Adam</rdfs:label>
</rdf:Description>
</rdf:RDF>
</xenoData>
What are Linked (open) data?
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
xmlns:foaf="http://xmlns.com/foaf/0.1/"
xmlns:dbo="http://dbpedia.org/ontology/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:tei="http://www.tei-c.org/ns/1.0">
<rdf:Description tei:ref="#Adam" rdf:about="http://dbpedia.org/resource/Adam">
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
<rdfs:label xml:lang="en">Adam</rdfs:label>
</rdf:Description>
</rdf:RDF>
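To see what this RDF/XML actually states, it helps to read it as subject, predicate, object triples. The sketch below is a deliberately small reader written with Python's standard library, not a full RDF parser: it only handles `rdf:Description` elements with `rdf:about`, it ignores the `tei:ref` attribute, and the function name `triples` is invented here. A real project would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# The slide's RDF/XML, with the unused namespace declarations left out.
RDF_XML = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
         xmlns:tei="http://www.tei-c.org/ns/1.0">
  <rdf:Description tei:ref="#Adam" rdf:about="http://dbpedia.org/resource/Adam">
    <rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
    <rdfs:label xml:lang="en">Adam</rdfs:label>
  </rdf:Description>
</rdf:RDF>"""


def triples(rdf_xml):
    """Return (subject, predicate, object) tuples for each rdf:Description."""
    root = ET.fromstring(rdf_xml)
    out = []
    for desc in root.iterfind(f"{{{RDF}}}Description"):
        subject = desc.get(f"{{{RDF}}}about")
        for prop in desc:
            # ElementTree tags look like "{namespace}local"; joining the two
            # parts gives the full predicate URI.
            predicate = prop.tag[1:].replace("}", "")
            resource = prop.get(f"{{{RDF}}}resource")
            if resource is not None:
                obj = resource  # the object is another resource
            else:
                obj = (prop.text or "", prop.get(XML_LANG))  # literal, language
            out.append((subject, predicate, obj))
    return out


for triple in triples(RDF_XML):
    print(triple)
```

Read this way, the block makes two statements about `http://dbpedia.org/resource/Adam`: it has type `foaf:Person`, and it has the English label "Adam".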
What are Linked (open) data?
DBpedia
DBpedia (from "DB" for "database") is a project aiming to extract structured content from the information created in the Wikipedia project. This structured information is made available on the World Wide Web. DBpedia allows users to semantically query relationships and properties of Wikipedia resources, including links to other related datasets.
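Semantic querying of DBpedia is usually done in SPARQL. The sketch below only builds a query and the corresponding request URL; it does not contact the server. The endpoint address `https://dbpedia.org/sparql` is DBpedia's public SPARQL endpoint, the `format` parameter is assumed to be accepted by that endpoint, and the helper names `label_query` and `request_url` are invented for this example.

```python
from urllib.parse import urlencode

ENDPOINT = "https://dbpedia.org/sparql"  # DBpedia's public SPARQL endpoint


def label_query(resource_uri, lang="en"):
    """Build a SPARQL query asking for the rdfs:label of one resource."""
    return (
        "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
        "SELECT ?label WHERE {\n"
        f"  <{resource_uri}> rdfs:label ?label .\n"
        f'  FILTER (lang(?label) = "{lang}")\n'
        "}"
    )


def request_url(query):
    """Encode the query as a GET request URL (the request is not sent here)."""
    params = {
        "query": query,
        # Assumed: the endpoint accepts this result-format parameter.
        "format": "application/sparql-results+json",
    }
    return ENDPOINT + "?" + urlencode(params)


query = label_query("http://dbpedia.org/resource/Marco_Polo", lang="it")
print(query)
print(request_url(query))
```

Opening the printed URL in a browser, or passing it to `urllib.request.urlopen`, should return the Italian label of the resource, provided the endpoint is reachable.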
The Power of the Network
[Figure: RDF graph linking DBpedia resources around Marco Polo and Venice]
Resources: dbr:Marco_Polo, dbr:Venice, dbr:Venetian_Lagoon, dbr:Ferdinand_Magellan, dbr:Giulia_Lama, dbr:Alpi_Eagles, dbr:Anthony_Quinn
Classes and categories: foaf:Person (person), dbc:Explorers_of_Asia (explorer of Asia)
Labels (rdfs:label): "Marco Polo"@it, "Поло, Марко"@ru, "Venezia"@it, "Venedig"@de
Literal: 1254
Properties: rdf:type, rdfs:label, dbo:birthDate, dbo:birthPlace, dbo:deathPlace, dbo:nearestCity, dbo:hubAirport
In conclusion - it is scary!
But FUN!