1 of 116

Graph data formats: RDF, RDFS, Linked Data

Jakub Klímek

3 of 116

Graph data representation

There is a thing, which is a catalog

The catalog has title “my catalog”

The catalog has description “my first testing catalog”

The catalog has homepage “https://mycatalog.example.org/homepage”

9 of 116

Graph data representation

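There is a thing, which is a catalog

The catalog has title “my catalog”

The catalog has description “my first testing catalog”

The catalog has homepage “https://mycatalog.example.org/homepage”

The homepage is a page

The same statements drawn as a graph (node --[edge label]--> node):

  • (the thing) --[which is a]--> Catalog
  • (the thing) --[has title]--> "my catalog"
  • (the thing) --[has description]--> "my first testing catalog"
  • (the thing) --[has homepage]--> https://mycatalog.example.org/homepage
  • https://mycatalog.example.org/homepage --[which is a]--> Page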

16 of 116

Graph data representation

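There is a thing, which is a catalog

The catalog has title “my catalog”

The catalog has description “my first testing catalog”

The catalog has homepage “https://mycatalog.example.org/homepage”

The homepage is a page

In the graph, the thing, the classes and the edge labels are identified by IRIs (node --[edge label]--> node):

  • https://mycatalog.example.org/catalog --[http://www.w3.org/1999/02/22-rdf-syntax-ns#type]--> http://www.w3.org/ns/dcat#Catalog
  • https://mycatalog.example.org/catalog --[http://purl.org/dc/terms/title]--> "my catalog"
  • https://mycatalog.example.org/catalog --[http://purl.org/dc/terms/description]--> "my first testing catalog"
  • https://mycatalog.example.org/catalog --[http://xmlns.com/foaf/0.1/homepage]--> https://mycatalog.example.org/homepage
  • https://mycatalog.example.org/homepage --[http://www.w3.org/1999/02/22-rdf-syntax-ns#type]--> http://xmlns.com/foaf/0.1/Page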

17 of 116

Graph data representation

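There is a thing, which is a catalog

The catalog has title “my catalog”

The catalog has description “my first testing catalog”

The catalog has homepage “https://mycatalog.example.org/homepage”

The homepage is a page

The graph with IRIs shortened to prefixed names (node --[edge label]--> node):

  • ex:catalog --[rdf:type]--> dcat:Catalog
  • ex:catalog --[dcterms:title]--> "my catalog"
  • ex:catalog --[dcterms:description]--> "my first testing catalog"
  • ex:catalog --[foaf:homepage]--> ex:homepage
  • ex:homepage --[rdf:type]--> foaf:Page

Each edge together with its two nodes is a statement, a triple.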

23 of 116

Graph data representation

ex:catalog rdf:type dcat:Catalog .
ex:catalog dcterms:title "my catalog" .
ex:catalog dcterms:description "my first testing catalog" .
ex:catalog foaf:homepage ex:homepage .
ex:homepage rdf:type foaf:Page .

There is a thing, which is a catalog

The catalog has title “my catalog”

The catalog has description “my first testing catalog”

The catalog has homepage “https://mycatalog.example.org/homepage”

The homepage is a page
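
As a rough illustration of where the example ends up, the five triples can be held in an ordinary Python set of (subject, predicate, object) tuples. This is only a sketch: no RDF library is used, prefixed names are kept as plain strings, and the helper name `outgoing` is made up for this example.

```python
# The five statements as (subject, predicate, object) tuples.
# Prefixed names and literals are plain strings in this sketch.
GRAPH = {
    ("ex:catalog", "rdf:type", "dcat:Catalog"),
    ("ex:catalog", "dcterms:title", '"my catalog"'),
    ("ex:catalog", "dcterms:description", '"my first testing catalog"'),
    ("ex:catalog", "foaf:homepage", "ex:homepage"),
    ("ex:homepage", "rdf:type", "foaf:Page"),
}


def outgoing(graph, subject):
    """Return the sorted (predicate, object) pairs of edges leaving `subject`."""
    return sorted((p, o) for s, p, o in graph if s == subject)


if __name__ == "__main__":
    for predicate, obj in outgoing(GRAPH, "ex:catalog"):
        print("ex:catalog", predicate, obj, ".")
```

Asking for the edges of ex:homepage follows the foaf:homepage link one step further, which is the graph view of the same data.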

24 of 116

RDF


25 of 116

RDF - Resource Description Framework - idea

RDF - graph-based data model - a set of triples

Triple describes a relation as:

subject predicate object

2004 & 2014 W3C Recommendations

Triples are written in one of RDF notations / syntaxes / serializations:

RDF/XML, RDFa, N-Triples, Turtle, JSON-LD, N-Quads, TriG

Jakub Klímek studied at Charles University .

  • subject: Jakub Klímek
  • predicate: studied at
  • object: Charles University

26 of 116

RDF model: a triple, a statement

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8574> .

  • subject (S): http://example.com/index.html (Resource / Thing)
  • predicate (P) (property): http://purl.org/dc/terms/creator
  • object (O): http://example.com/staff/8574 (Resource / Thing)

27 of 116

RDF model: a triple with literal value

<http://example.com/index.html> <http://purl.org/dc/terms/subject> "education" .

  • subject (S): http://example.com/index.html (Resource)
  • predicate (P): http://purl.org/dc/terms/subject
  • object (O): education (Literal)

28 of 116

RDF serializations: IRIs and IRI prefixes

<http://purl.org/dc/terms/creator>

=

@prefix dcterms: <http://purl.org/dc/terms/> .

dcterms:creator
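
The equivalence above is plain string concatenation: the namespace IRI bound to the prefix, followed by the local part. A minimal sketch, assuming prefixes are kept in a dictionary; the function name `expand` is hypothetical and not taken from any RDF library.

```python
# Prefix table corresponding to: @prefix dcterms: <http://purl.org/dc/terms/> .
PREFIXES = {"dcterms": "http://purl.org/dc/terms/"}


def expand(prefixed_name, prefixes=PREFIXES):
    """Expand a prefixed name such as dcterms:creator into a full IRI."""
    prefix, _, local = prefixed_name.partition(":")
    if prefix not in prefixes:
        raise KeyError(f"undeclared prefix: {prefix!r}")
    return prefixes[prefix] + local


if __name__ == "__main__":
    print(expand("dcterms:creator"))
```

The same function also covers the empty prefix and an empty local part, as long as the dictionary contains a matching entry.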


29 of 116

RDF model: multiple properties

my:index.html dcterms:creator exstaff:85740 .
my:index.html dcterms:subject "education" .
my:index.html dcterms:language "en" .

[Figure: my:index.html with edges dcterms:creator to my:staff/85740, dcterms:subject to education, and dcterms:language to en]

The triples form a set, i.e. there is no ordering among them.
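
The set semantics can be checked with a few lines of Python. This is a sketch only: triples are plain tuples of strings, and `same_graph` is a made-up helper that ignores blank nodes, which would need a proper isomorphism check.

```python
first = [
    ("my:index.html", "dcterms:creator", "exstaff:85740"),
    ("my:index.html", "dcterms:subject", '"education"'),
    ("my:index.html", "dcterms:language", '"en"'),
]

# The same triples in a different order, with one of them repeated.
second = [first[2], first[0], first[1], first[0]]


def same_graph(triples_a, triples_b):
    """Two triple listings describe the same graph if their sets are equal."""
    return set(triples_a) == set(triples_b)


if __name__ == "__main__":
    print(same_graph(first, second))
```

Neither the order nor the repetition changes the graph, which is why a serialization is free to list triples in any order.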

30 of 116

RDF model: typed literals

my:index.html dcterms:created "2020-04-23"^^xsd:date .

my:index.html --[dcterms:created]--> "2020-04-23"^^<http://www.w3.org/2001/XMLSchema#date>

31 of 116

RDF model: text literals with a language tag

my:index.html dcterms:title "Homepage of Jakub Klímek"@en .

my:index.html --[dcterms:title]--> "Homepage of Jakub Klímek"@en
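
A literal therefore carries a lexical form plus either a datatype IRI or a language tag. The sketch below models that with a small data class; the class and the method name `to_ntriples` are illustrative assumptions, and only backslash and double quote are escaped (newlines and other control characters are not handled).

```python
from dataclasses import dataclass
from typing import Optional

XSD = "http://www.w3.org/2001/XMLSchema#"


@dataclass(frozen=True)
class Literal:
    lexical: str
    datatype: Optional[str] = None  # full datatype IRI
    language: Optional[str] = None  # language tag such as "en"

    def to_ntriples(self) -> str:
        """Render the literal in N-Triples-like syntax (simplified escaping)."""
        text = self.lexical.replace("\\", "\\\\").replace('"', '\\"')
        if self.language:
            return f'"{text}"@{self.language}'
        if self.datatype:
            return f'"{text}"^^<{self.datatype}>'
        return f'"{text}"'


if __name__ == "__main__":
    print(Literal("2020-04-23", datatype=XSD + "date").to_ntriples())
    print(Literal("Homepage of Jakub Klímek", language="en").to_ntriples())
```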

32 of 116

RDF model: classes

my:staff/85740 (Resource) --[rdf:type]--> my:Person (Resource, Class)

33 of 116

RDF model: blank nodes

my:staff/85740 my:hasAddress _:a1 .

_:a1 my:street "Malostranské nám. 25" .
_:a1 my:city "Prague" .
_:a1 my:zipCode "11800" .

[Figure: my:staff/85740 --[my:hasAddress]--> a blank node with my:street "Malostranské nám. 25", my:city "Prague" and my:zipCode "11800"]

34 of 116

RDF - Resource Description Framework

  • 1.0: W3C Recommendation
    • 10 February 2004
  • 1.1: W3C Recommendation
    • 25 February 2014
  • 1.2: Working draft
    • 22 August 2024
  • Graph data model
    • Directed labeled multigraph
      • Vertices for subjects and objects
      • Labeled edges for particular triples


35 of 116

RDF serializations


36 of 116

RDF 1.1 N-Triples

  • W3C Recommendation
    • 25 February 2014

<http://one.example/subject1> <http://one.example/predicate1> <http://one.example/object1> . # comments here

<http://example.org/show/218> <http://example.org/show/localName> "That Seventies Show"@en . # literal with a language tag

<http://en.wikipedia.org/wiki/Helium> <http://example.org/elements/atomicNumber> "2"^^<http://www.w3.org/2001/XMLSchema#integer> . # xsd:integer

37 of 116

RDF 1.1 N-Triples

<http://example.com/index.html> <http://purl.org/dc/terms/created> "2020-04-23"^^<http://…#date> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8574> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8575> .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "Moje stránka"@cs .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "My page"@en .


38 of 116

RDF 1.1 Turtle - Prefixes

<http://example.com/index.html> <http://purl.org/dc/terms/created> "2020-04-23"^^<http://…#date> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8574> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8575> .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "Moje stránka"@cs .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "My page"@en .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix my: <http://example.com/> .
@prefix staff: <http://example.com/staff/> .

my:index.html dcterms:created "2020-04-23"^^xsd:date .
my:index.html dcterms:creator staff:8574 .
my:index.html dcterms:creator staff:8575 .
my:index.html dcterms:title "Moje stránka"@cs .
my:index.html dcterms:title "My page"@en .

39 of 116

RDF 1.1 Turtle - Prefixes and ";"

<http://example.com/index.html> <http://purl.org/dc/terms/created> "2020-04-23"^^<http://…#date> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8574> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8575> .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "Moje stránka"@cs .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "My page"@en .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix my: <http://example.com/> .
@prefix staff: <http://example.com/staff/> .

my:index.html dcterms:created "2020-04-23"^^xsd:date ;
    dcterms:creator staff:8574 ;
    dcterms:creator staff:8575 ;
    dcterms:title "Moje stránka"@cs ;
    dcterms:title "My page"@en .

40 of 116

RDF 1.1 Turtle - Prefixes and ";" and ","

<http://example.com/index.html> <http://purl.org/dc/terms/created> "2020-04-23"^^<http://…#date> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8574> .

<http://example.com/index.html> <http://purl.org/dc/terms/creator> <http://example.com/staff/8575> .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "Moje stránka"@cs .

<http://example.com/index.html> <http://purl.org/dc/terms/title> "My page"@en .

@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix my: <http://example.com/> .
@prefix staff: <http://example.com/staff/> .

my:index.html dcterms:created "2020-04-23"^^xsd:date ;
    dcterms:creator staff:8574 ,
        staff:8575 ;
    dcterms:title "Moje stránka"@cs ,
        "My page"@en .
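
The ";" and "," abbreviations amount to grouping triples first by subject and then by predicate. The following sketch shows that grouping under simplifying assumptions: terms are already serialized Turtle strings, prefix declarations are left out, and `to_turtle` is a hypothetical helper rather than a real serializer.

```python
def to_turtle(triples):
    """Group triples by subject (';') and by predicate (',')."""
    grouped = {}
    for s, p, o in triples:
        grouped.setdefault(s, {}).setdefault(p, []).append(o)

    blocks = []
    for s, by_predicate in grouped.items():
        parts = [f"{p} " + " , ".join(objs) for p, objs in by_predicate.items()]
        blocks.append(f"{s} " + " ;\n    ".join(parts) + " .")
    return "\n".join(blocks)


if __name__ == "__main__":
    print(to_turtle([
        ("my:index.html", "dcterms:creator", "staff:8574"),
        ("my:index.html", "dcterms:creator", "staff:8575"),
        ("my:index.html", "dcterms:title", '"My page"@en'),
    ]))
```

Because dictionaries keep insertion order, the output follows the order in which subjects and predicates first appear; the graph itself does not depend on that order.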

41 of 116

RDF 1.1 Turtle - More prefixes

@prefix foo: <http://example.org/ns#> .
@prefix : <http://other.example.org/ns#> .

foo:bar foo: : .
:bar : foo:bar .

<http://example.org/ns#bar> <http://example.org/ns#> <http://other.example.org/ns#> .

<http://other.example.org/ns#bar> <http://other.example.org/ns#> <http://example.org/ns#bar> .


42 of 116

RDF 1.1 Turtle - Relative IRIs

  • </path>
  • <#fragment>
  • <>

We need to know what the IRI is relative to.

  • Implicitly, a document URL (if known)
  • Explicitly using @base

Assuming the document URL is https://test.org/doc

@prefix foo: <http://example.org/ns#> .

<#document> foo: <https://jk.com> .

@base <http://newbase.com/> .
<#document> foo: <https://jk.com> .

<https://test.org/doc#document> <http://example.org/ns#> <https://jk.com> .

<http://newbase.com/#document> <http://example.org/ns#> <https://jk.com> .
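
The two results above can be reproduced with Python's standard `urllib.parse.urljoin`, which implements generic relative-reference resolution. This is a sketch of the resolution step only, not of a Turtle parser; the wrapper name `resolve` is made up for the example.

```python
from urllib.parse import urljoin


def resolve(base, relative):
    """Resolve a relative IRI reference against a base IRI."""
    return urljoin(base, relative)


if __name__ == "__main__":
    # Implicit base: the document URL.
    print(resolve("https://test.org/doc", "#document"))
    # Explicit base, as set by @base <http://newbase.com/> .
    print(resolve("http://newbase.com/", "#document"))
```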


43 of 116

RDF 1.1 Turtle - Multiline strings, escapes

"""a string
with newlines
"""

  • \t (U+0009, tab)
  • \n (U+000A, linefeed)
  • \r (U+000D, carriage return)
  • \" (U+0022, double quote - only allowed inside strings)
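  • \> (U+003E, greater than - only allowed inside URIs; this escape comes from older Turtle drafts, Turtle 1.1 accepts only \uHHHH and \UHHHHHHHH inside IRIs)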
  • \\ (U+005C, backslash)
  • \uHHHH or \UHHHHHHHH for writing Unicode characters by hexadecimal codepoint where H is a single hexadecimal digit.


44 of 116

RDF 1.1 Turtle - Class assignment (rdf:type)

<http://example.com/index.html> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> foaf:Document .

=

<http://example.com/index.html> a foaf:Document .


45 of 116

RDF 1.1 Turtle - blank nodes

<http://example.com/> a v:VCard ;
    v:adr [
        a v:Work ;
        v:country-name "Australia" ;
        v:locality "WonderCity" ;
        v:postal-code "5555" ;
        v:street-address "33 Enterprise Drive"
    ] ;
.

=

<http://example.com/> v:adr _:1234 .
_:1234 a v:Work ;
    v:locality "WonderCity" ;
    ...

46 of 116

RDF 1.1 Turtle - datatype shortcuts

ex:Car ex:numberOfWheels 4 .
ex:Car ex:numberOfWheels +4 .
ex:Car ex:numberOfWheels "4"^^xsd:integer .

ex:Car ex:value 1300000.0 .
ex:Car ex:value "1300000.0"^^xsd:decimal .

ex:Car ex:value 1.3e6 .
ex:Car ex:value "1.3e6"^^xsd:double .

ex:Car ex:leftHandDrive true .
ex:Car ex:leftHandDrive "true"^^xsd:boolean .
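
Which datatype a bare token stands for depends only on its shape. A minimal sketch of that classification, with regular expressions written to mirror the integer, decimal, double and boolean shorthand forms; the function name `shorthand_datatype` is an assumption of this example.

```python
import re

XSD = "http://www.w3.org/2001/XMLSchema#"

# Token shapes for the four shorthand forms; fullmatch keeps them exclusive.
_PATTERNS = [
    (re.compile(r"true|false"), "boolean"),
    (re.compile(r"[+-]?[0-9]+"), "integer"),
    (re.compile(r"[+-]?[0-9]*\.[0-9]+"), "decimal"),
    (re.compile(r"[+-]?([0-9]+\.[0-9]*|\.[0-9]+|[0-9]+)[eE][+-]?[0-9]+"), "double"),
]


def shorthand_datatype(token):
    """Return the datatype IRI implied by a bare Turtle token, or None."""
    for pattern, name in _PATTERNS:
        if pattern.fullmatch(token):
            return XSD + name
    return None


if __name__ == "__main__":
    for token in ["4", "+4", "1300000.0", "1.3e6", "true"]:
        print(token, "->", shorthand_datatype(token))
```

The decimal point without an exponent is what separates xsd:decimal from xsd:double, so 1300000.0 and 1.3e6 end up with different datatypes even though they denote the same number.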


47 of 116

RDF 1.1 Turtle - playing with prefixes and rel. IRIs

# In-scope base IRI is the document URI at this point
<a1> <b1> <c1> .
@base <http://example.org/ns/> .

# In-scope base IRI is http://example.org/ns/ at this point
<a2> <http://example.org/ns/b2> <c2> .   # <c2> is http://example.org/ns/c2
@base <foo/> .                           # resolves to http://example.org/ns/foo/

# In-scope base IRI is http://example.org/ns/foo/ at this point
<a3> <b3> <c3> .                         # <c3> is http://example.org/ns/foo/c3
@prefix : <bar#> .                       # : is http://example.org/ns/foo/bar#
:a4 :b4 :c4 .                            # :c4 is http://example.org/ns/foo/bar#c4
@prefix : <http://example.org/ns2#> .
:a5 :b5 :c5 .                            # :c5 is http://example.org/ns2#c5

48 of 116

Detour from RDF serializations


49 of 116

RDF model - statements about statements

my:index.html my:createdBy "Jakub Klímek" .


This statement

  • came from https://x.y.z
  • was scraped on 2020-04-23

How to represent these facts in RDF?

50 of 116

RDF model - reification

Direct approach to the problem:

Statement will become a resource.

  • Assign IRI to the statement itself, or
  • make it a blank node (e.g. _:triple1)

Original statement:

my:index.html my:createdBy "Jakub Klímek" .

Reified statement:

_:triple1 a rdf:Statement .

_:triple1 rdf:subject my:index.html .

_:triple1 rdf:predicate my:createdBy .

_:triple1 rdf:object "Jakub Klímek" .

Additional statements about the statement can then be added:

_:triple1 dcterms:created "2020-04-23"^^xsd:date .
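
Reification is mechanical: one triple is replaced by four triples about a new statement node. A sketch under the assumption that triples are tuples of strings; the function name `reify` and the default node label are illustrative.

```python
def reify(triple, node="_:triple1"):
    """Return the four reification triples describing `triple`."""
    s, p, o = triple
    return [
        (node, "rdf:type", "rdf:Statement"),
        (node, "rdf:subject", s),
        (node, "rdf:predicate", p),
        (node, "rdf:object", o),
    ]


if __name__ == "__main__":
    original = ("my:index.html", "my:createdBy", '"Jakub Klímek"')
    statements = reify(original)
    # The statement node can now be the subject of further triples.
    statements.append(("_:triple1", "dcterms:created", '"2020-04-23"^^xsd:date'))
    for s, p, o in statements:
        print(s, p, o, ".")
```

Note that the four reification triples only describe the original statement; they do not assert it, so the original triple has to be kept in the graph as well if it is meant to hold.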

51 of 116

RDF model - named graphs

Alternative approach to the problem:

  • Statements belong to “named graphs”
  • Named graphs are resources
  • We can state facts about resources

RDF Triples become RDF Quads

  • S P O G
  • G can be used as subject of another triple (quad)

[Figure: three copies of the example graph (my:index.html with dcterms:creator my:staff/85740, dcterms:subject education and dcterms:language en); one is marked as the named graph https://example.org/named-graphs/1]

52 of 116

RDF dataset

Consists of

  • set of named graphs
    • identified by IRIs
  • one default graph
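
A dataset of this shape can be sketched as a mapping from graph name to a set of triples, with a reserved key for the default graph. The class below is an illustration only; `Dataset`, `DEFAULT_GRAPH` and the method names are assumptions, not the API of an RDF library.

```python
from collections import defaultdict

DEFAULT_GRAPH = None  # key used for the default graph in this sketch


class Dataset:
    def __init__(self):
        # graph name (IRI string or DEFAULT_GRAPH) -> set of triples
        self.graphs = defaultdict(set)

    def add(self, s, p, o, graph=DEFAULT_GRAPH):
        self.graphs[graph].add((s, p, o))

    def quads(self):
        """Yield (subject, predicate, object, graph) for every stored triple."""
        for graph, triples in self.graphs.items():
            for s, p, o in sorted(triples):
                yield (s, p, o, graph)


if __name__ == "__main__":
    ds = Dataset()
    g1 = "https://example.org/named-graphs/1"
    ds.add("my:index.html", "dcterms:language", '"en"', graph=g1)
    # The graph name itself is used as a subject in the default graph.
    ds.add(g1, "dcterms:created", '"2020-04-23"^^xsd:date')
    for quad in ds.quads():
        print(quad)
```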

[Figure: three copies of the example graph (my:index.html with dcterms:creator my:staff/85740, dcterms:subject education and dcterms:language en), labeled https://example.org/named-graphs/1, https://example.org/named-graphs/2 and default graph]

53 of 116

Back to RDF serializations


54 of 116

RDF 1.1 N-Quads

  • W3C Recommendation
    • 25 February 2014
  • Based on N-Triples, adds support for named graphs

S P O G

<http://example.org/#spiderman> <http://www.perceive.net/schemas/relationship/enemyOf> <http://example.org/#green-goblin> <http://example.org/graphs/spiderman> .


55 of 116

RDF 1.1 TriG

  • W3C Recommendation
    • 25 February 2014
  • Based on RDF Turtle, adds support for named graphs

RDF Dataset consists of

  • 1 default graph
  • N named graphs

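@base <http://www.w3.org/People/> .
@prefix : <http://xmlns.com/foaf/0.1/> .
# ericFoaf: is not declared on the slide; this declaration is an assumption
@prefix ericFoaf: <Eric/ericP-foaf.rdf#> .

# default graph
{
    ericFoaf:ericP :givenName "Eric" .
}

# also default graph, no {}
ericFoaf:ericP :givenName "Eric" .

# named graph, introduced by the optional GRAPH keyword
GRAPH <Eric/ericP-foaf.rdf> {
    ericFoaf:ericP :givenName "Eric" .
}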

56 of 116

RDFS: RDF Schema


57 of 116

RDFS - RDF Schema 1.1

W3C Recommendation

  • 25 February 2014

Vocabulary for creation of other RDF vocabularies

RDF Vocabulary

  • collection of classes and properties, their IRIs and their definitions


58 of 116

RDFS - defining classes and class hierarchies

ex:MotorVehicle rdf:type rdfs:Class .

ex:PassengerVehicle rdf:type rdfs:Class .

ex:Van rdf:type rdfs:Class .

ex:Truck rdf:type rdfs:Class .

ex:MiniVan rdf:type rdfs:Class .

ex:PassengerVehicle rdfs:subClassOf ex:MotorVehicle .

ex:Van rdfs:subClassOf ex:MotorVehicle .

ex:Truck rdfs:subClassOf ex:MotorVehicle .

ex:MiniVan rdfs:subClassOf ex:Van .

ex:MiniVan rdfs:subClassOf ex:PassengerVehicle .
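
Since rdfs:subClassOf is transitive, the hierarchy above implies, for example, that every ex:MiniVan is also an ex:MotorVehicle. A small sketch that computes all superclasses of a class from the direct statements; the dictionary layout and the function name `superclasses` are assumptions of this example.

```python
# Direct rdfs:subClassOf statements from the example.
SUBCLASS_OF = {
    "ex:PassengerVehicle": {"ex:MotorVehicle"},
    "ex:Van": {"ex:MotorVehicle"},
    "ex:Truck": {"ex:MotorVehicle"},
    "ex:MiniVan": {"ex:Van", "ex:PassengerVehicle"},
}


def superclasses(cls, subclass_of=SUBCLASS_OF):
    """Return every class reachable from `cls` via rdfs:subClassOf."""
    found, stack = set(), [cls]
    while stack:
        for parent in subclass_of.get(stack.pop(), ()):
            if parent not in found:
                found.add(parent)
                stack.append(parent)
    return found


if __name__ == "__main__":
    print(sorted(superclasses("ex:MiniVan")))
```

A class may have several direct superclasses, as ex:MiniVan does here, so the traversal works on a graph rather than on a tree.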


59 of 116

RDFS - defining classes and class hierarchies

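  • ex:PassengerVehicle --[rdfs:subClassOf]--> ex:MotorVehicle
  • ex:Van --[rdfs:subClassOf]--> ex:MotorVehicle
  • ex:Truck --[rdfs:subClassOf]--> ex:MotorVehicle
  • ex:MiniVan --[rdfs:subClassOf]--> ex:Van
  • ex:MiniVan --[rdfs:subClassOf]--> ex:PassengerVehicle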

60 of 116

RDFS - defining properties and property hierarchies

ex:Person rdf:type rdfs:Class .

ex:author rdf:type rdf:Property .

ex:author rdfs:range ex:Person .

ex:hasMother rdf:type rdf:Property .

ex:hasMother rdfs:range ex:Female .

ex:hasMother rdfs:domain ex:Person .

ex:age rdf:type rdf:Property .

ex:age rdfs:range xsd:integer .

exterms:weight rdfs:domain ex:Book .

exterms:weight rdfs:domain ex:MotorVehicle .

ex:driver rdf:type rdf:Property .

ex:primaryDriver rdf:type rdf:Property .

ex:primaryDriver rdfs:subPropertyOf ex:driver .

Big difference between RDFS and object-oriented programming (OOP): RDF properties are first-class citizens.

They can exist on their own, independently of any class.
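
In RDFS, rdfs:domain and rdfs:range are used to infer types rather than to reject data. A sketch of that inference for the ex:hasMother and ex:author declarations above; the individuals ex:alice and ex:carol are invented for the example, and literal ranges such as xsd:integer are left out.

```python
# Declarations taken from the example above.
DOMAIN = {"ex:hasMother": "ex:Person"}
RANGE = {"ex:hasMother": "ex:Female", "ex:author": "ex:Person"}


def infer_types(triples):
    """Derive rdf:type triples from rdfs:domain and rdfs:range declarations."""
    inferred = set()
    for s, p, o in triples:
        if p in DOMAIN:
            inferred.add((s, "rdf:type", DOMAIN[p]))
        if p in RANGE:
            inferred.add((o, "rdf:type", RANGE[p]))
    return inferred


if __name__ == "__main__":
    # ex:alice and ex:carol are hypothetical individuals.
    data = [("ex:alice", "ex:hasMother", "ex:carol")]
    for triple in sorted(infer_types(data)):
        print(triple)
```

Nothing in the data says that ex:carol is an ex:Female; the type follows from using the property, which is the opposite of a schema check that would report the missing type as an error.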


61 of 116

RDFS - property hierarchies

ex:driver a rdf:Property .
ex:primaryDriver a rdf:Property .
ex:primaryDriver rdfs:subPropertyOf ex:driver .

[Figure: ex:MotorVehicle and ex:Person connected by the properties ex:driver and ex:primaryDriver]

62 of 116

RDFS: label, comment, seeAlso

  • rdfs:label
    • Human readable name of a resource
  • rdfs:comment
    • Longer description of a resource
  • rdfs:seeAlso
    • Points to a resource that might provide more information about the subject resource
  • rdfs:isDefinedBy
    • Points to a resource that defines the subject resource, e.g. the vocabulary in which it is described


63 of 116

RDF model: rdf:List for closed collection

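  • my:list --[rdf:first]--> my:item1
  • my:list --[rdf:rest]--> (second list node)
  • (second list node) --[rdf:first]--> my:item2
  • (second list node) --[rdf:rest]--> rdf:nil
  • my:list, the second list node and rdf:nil --[rdf:type]--> rdf:List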

64 of 116

RDF model: containers for open collections

rdf:Bag, rdf:Seq, rdf:Alt

  • my:bag --[rdf:type]--> rdf:Bag
  • my:bag --[rdf:_1]--> my:item1
  • my:bag --[rdf:_2]--> my:item2
  • my:bag --[rdf:_3]--> my:item3

65 of 116

RDF 1.1 Turtle - rdf:List shortcut

# the value of this triple is the RDF collection blank node

:subject :predicate ( :a :b :c ) .

# an empty collection value - rdf:nil

:subject :predicate2 () .
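
The parenthesized shortcut stands for a chain of rdf:first / rdf:rest triples ending in rdf:nil. A sketch of that expansion; the blank node labels `_:l1`, `_:l2`, ... and the function name `expand_list` are made up, since a real parser generates its own blank nodes.

```python
def expand_list(items, label="_:l"):
    """Expand a collection into rdf:first/rdf:rest triples.

    Returns the head node of the list and the list of triples.
    """
    if not items:
        return "rdf:nil", []
    nodes = [f"{label}{i}" for i in range(1, len(items) + 1)]
    triples = []
    for i, item in enumerate(items):
        rest = nodes[i + 1] if i + 1 < len(nodes) else "rdf:nil"
        triples.append((nodes[i], "rdf:first", item))
        triples.append((nodes[i], "rdf:rest", rest))
    return nodes[0], triples


if __name__ == "__main__":
    head, triples = expand_list([":a", ":b", ":c"])
    print(":subject", ":predicate", head, ".")
    for s, p, o in triples:
        print(s, p, o, ".")
```

The closing rdf:nil is what makes the collection closed: nobody can append an item without changing an existing triple.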


66 of 116

RDFS - RDF Schema 1.1

W3C Recommendation

  • 25 February 2014

Vocabulary for creation of other RDF vocabularies

RDF Vocabulary

  • collection of classes and properties and their definitions

Functionally different from other schema languages, e.g. XML Schema

RDF is “schema-less”

67 of 116

Open World Assumption (OWA)

“open-world assumption is the assumption that the truth value of a statement may be true irrespective of whether or not it is known to be true”

-- Open-world assumption

Statement: "Mary" "is a citizen of" "France"

Question: Is Paul a citizen of France?

"Closed world" (for example SQL) answer: No.

"Open world" answer: Unknown.
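
The difference between the two answers fits into a few lines. This sketch models "unknown" as Python's `None`; the function names are made up for the example.

```python
KNOWN = {("Mary", "is a citizen of", "France")}


def closed_world(statement, facts=KNOWN):
    """Closed world: whatever is not stated is false."""
    return statement in facts


def open_world(statement, facts=KNOWN):
    """Open world: whatever is not stated is unknown (None)."""
    return True if statement in facts else None


if __name__ == "__main__":
    question = ("Paul", "is a citizen of", "France")
    print("closed world:", closed_world(question))
    print("open world:", open_world(question))
```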

68 of 116

Linked Data


69 of 116

Regular data ~ not linked

ID    Name             Stars
1235  Joaquin Phoenix  Joker
1234  Robert De Niro   Joker
...

[{
    "id": "1234",
    "id2": "az-11",
    "name": {
        "en": "Joker"
    },
    "rating": 9.0
}, {
    "id": "1235",
    "id2": "yt-18",
    "name": {
        "en": "Joker"
    },
    "rating": 5.6
}]

  • Hey, look at 1234, he’s cool!
  • This 1234 (the table row)? Or this 1234 (the JSON object)?
  • Where and how do I get data about 1234?
  • Which Joker, the 5.6 or the 9.0 one?
  • Stars in, or stars as?

70 of 116

Issues with regular, non-linked data

i.e. CSV, JSON, XML, Excel files...

  • Ambiguous identification of entities in data
    • Person with ID aaa1234 in a document located on my laptop in folder /data/temp/people.json
    • Another person with ID aaa1234 in the XML file on this CD
  • Low findability and accessibility of data
    • Get data about person aaa1234 => Go to my laptop, open the folder, load/open the file, search/query
  • No contextual information
    • Person aaa1234 lives in Prague. I want to know more about Prague. Where and how do I get the information?

ID    Name             Stars
1235  Joaquin Phoenix  Joker
1234  Robert De Niro   Joker
...

71 of 116

Issues => Additional requirements on data

i.e. CSV, JSON, XML, ...

  • Identification of entities in data
    • Global
    • Unique
  • Findability and accessibility of data
    • Find data based on the identification
    • Access it in a single format
  • Contextual information
    • When I access information, I want to know where and how to find more

Is there such a system?

72 of 116

The World Wide Web

Shared global space of documents

Built on top of several simple principles:

  1. HTML as a format for publishing documents
  2. URLs as unique global identifiers of documents
  3. HTTP for locating and accessing documents by their URLs
  4. hyperlinks between documents

There are two kinds of applications working in this space of documents:

  • web browsers (locating and browsing documents through hyperlinks)
  • search engines (indexing and full text searching of documents)

[Figure: a web browser and a search engine accessing HTML documents over HTTP]

73 of 116

The World Wide Web - what can we do with it?

  • Publish human-readable documents
  • Everyone can view them in their browser
    • if they know the URL
  • + links
    • To documents with yet unknown URL
    • From other documents
    • From catalogs
  • Fulltext search, keyword search


74 of 116

The World Wide Web - 30 years, Sir Tim Berners-Lee


75 of 116

From the Web to Linked Data


76 of 116

Web of Documents

Lots of information about Prague in the Web of Documents. Problems:

  • Encoded in documents distributed across the Web
  • Documents intended for humans not computers
  • Documents about Prague or related things not linked
  • Computers not able to process data about Prague published on the Web

[Figure: separate documents: Prague budget, Basic info about Prague, Prague public contracts, Demography of Prague, EU funded projects in Prague]

77 of 116

Web of Documents

Queries that the available information could answer:

  • Top 100 suppliers of Prague with headquarters outside of Prague region.
  • Money spent in Prague on new children's playgrounds in the last 5 years, per child.
  • Organizations in Prague funded by EU structural funds and their top 100 suppliers.


78 of 116

The World Wide Web

  • Typically, there are underlying databases
  • From which, human readable documents are generated

  • ...and scraped by users who want to query it

[Figure: a web browser and a search engine accessing, over HTTP, HTML documents generated from Database A, Database B, Database C and Database D]

79 of 116

The World Wide Web

Different APIs provide machine-readable data for further processing in so-called mash-up applications.

Also built on several simple principles:

  • XML/JSON as formats for publishing data
  • HTTP protocol for transferring data between APIs and applications

[Figure: mash-up apps accessing Database A, B, C and D over HTTP, each through its own Proprietary Data API (A, B, C, D)]

80 of 116

Social network silos


81 of 116

Problems with data on the current Web

Web of Documents vs. the current Web, which IS NOT a Web of Data!

  • URLs as unique global identifiers of documents
    • no unique global identifiers of things
  • HTML as a format for publishing documents
    • many formats for publishing data (XML, JSON, CSV, XLS, ...)
  • HTTP for locating and accessing documents by their URLs
    • HTTP for locating APIs and accessing them (REST) [but not for locating things and accessing their data]
  • hyperlinks between documents
    • none of the current formats enables us to link related entities

Can we apply the principles of the Web to data?

82 of 116

Linked Data ~ the Web of Data

Principles, “best practices” for publishing and linking data about entities on the Web.

  • Application of the proven principles of Web of Documents to data
  • 2 main goals
    • Machine readable and understandable data (based on the Semantic Web)
    • Providing context to data (via links to other data)


83 of 116

The principles of Linked Data

  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
  4. Include links to other URIs so that they can discover more things.


84 of 116

Web of Documents without the first two principles

Web pages without URLs

  • What web page are you talking about?
  • Where do I get the web page you are talking about?
  • How do I get the web page you are talking about?
  • ...

ID    Name             Stars
1235  Joaquin Phoenix  Joker
1234  Robert De Niro   Joker
...

85 of 116

Things as first-class citizens

  • Project CZ.2.16/2.1.00/22189
  • Prague City
  • Prague Council
  • Prague Demography
  • Prague Budget
  • Contract DIL/23/07/007302/2010

88 of 116

1. + 2. Use HTTP URIs as names for things

  • Project CZ.2.16/2.1.00/22189
  • praha.eu (Prague)
    • http://praha.eu/contract/7302
    • http://praha.eu/council
    • http://praha.eu/city
  • mfcr.cz (Ministry of Finance)
    • http://mfcr.cz/prague/budget
    • http://mfcr.cz/prague
  • risy.cz (Regional Information Service)
    • http://risy.cz/location/prague
    • http://risy.cz/contract/22189-01
    • http://risy.cz/project/22189
  • czso.cz (Czech Statistical Office)
    • http://registry.czso.cz/prague
    • http://czso.cz/prague
    • http://czso.cz/prague/demogstat

89 of 116

The principles of Linked Data

  • Use URIs as names for things
  • Use HTTP URIs so that people can look up those names.
  • When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL)
  • Include links to other URIs so that they can discover more things.


90 of 116

Web of Documents without the third principle

Web pages are in many formats, not only HTML

Thanks for the URI of your web page

  • In what language/format is your page?
  • Which software supports your language for pages?
  • How many browsers do you have?

… we all know this: how often do you click on a link and a PDF, Word or Excel file opens?

91 of 116

Technical detour: HTTP Accept header and URIs

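  • Web browser: requests http://esfcr.cz/.../projekt/CZ10421016300169 from esfcr.cz over HTTP and gets HTML
  • Applications: request http://esfcr.cz/.../projekt/CZ10421016300169 from esfcr.cz over HTTP and get RDF: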

<http://esfcr.cz/data/projekt/CZ10421016300169>

esf:nazev "INNOSTART" ;

esf:registracni_cislo "CZ.1.04/2.1.01/63.00169" ;

esf:castka "4711681" ;

esf:realizace_od "2011-06-01" ;

esf:realizace_do "2013-03-31" ;

esf:realizator <http://esfcr.cz/.../25438352> ;

esf:partner <http://esfcr.cz/.../25438352> ;

esf:kontaktni_osoba <http://esfcr.cz/.../8541274571>;

esf:region <http://esfcr.cz/.../ustecky> .

http://esfcr.cz/.../projekt/

CZ10421016300169
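
The Accept header idea above can be sketched in a few lines of Python. This is only an illustration: it prepares two requests for the same IRI without sending them, and the helper name `build_request` is made up for this example. The IRI is the one shown on the slide; whether the server still responds to it is not assumed.

```python
from urllib.request import Request

# IRI taken from the slide; nothing is sent over the network in this sketch.
PROJECT_IRI = "http://esfcr.cz/data/projekt/CZ10421016300169"


def build_request(iri: str, media_type: str) -> Request:
    """Prepare (but do not send) a GET request asking for one representation."""
    return Request(iri, headers={"Accept": media_type}, method="GET")


# A web browser typically asks for HTML ...
browser_request = build_request(PROJECT_IRI, "text/html")
# ... while a data application asks for an RDF serialization such as Turtle.
app_request = build_request(PROJECT_IRI, "text/turtle")

for req in (browser_request, app_request):
    print(req.get_method(), req.full_url, "Accept:", req.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would perform the actual lookup; what comes back depends entirely on how the server implements content negotiation.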

92 of 116

The principles of Linked Data

  1. Use URIs as names for things.
  2. Use HTTP URIs so that people can look up those names.
  3. When someone looks up a URI, provide useful information, using the standards (RDF, SPARQL).
  4. Include links to other URIs so that they can discover more things.

93 of 116

Web of Documents without the fourth principle

Web pages without links to other pages

  • Imagine a Wiki page with no links
  • Where do I get more information about the things mentioned in the article?
  • Is the thing mentioned in the article really the one I think it is?

94 of 116

4. Include links to other URIs (provide context)

praha.eu (Prague)

  • http://praha.eu/contract/7302
  • http://praha.eu/city
  • http://praha.eu/council

mfcr.cz (Ministry of Finance)

  • http://mfcr.cz/prague/budget
  • http://mfcr.cz/prague

risy.cz (Regional Information Service)

  • http://risy.cz/location/prague
  • http://risy.cz/contract/22189-01
  • http://risy.cz/project/22189

czso.cz (Czech Statistical Office)

  • http://registry.czso.cz/prague
  • http://czso.cz/prague
  • http://czso.cz/prague/demogstat

95 of 116

4. Include links to other URIs (provide context)

praha.eu (Prague)

  • http://praha.eu/contract/7302
  • http://praha.eu/city
  • http://praha.eu/council

mfcr.cz (Ministry of Finance)

  • http://mfcr.cz/prague/budget
  • http://mfcr.cz/prague

risy.cz (Regional Information Service)

  • http://risy.cz/location/prague
  • http://risy.cz/contract/22189-01
  • http://risy.cz/project/22189

czso.cz (Czech Statistical Office)

  • http://registry.czso.cz/prague
  • http://czso.cz/prague
  • http://czso.cz/prague/demogstat

Link predicates: a:fundedBy, b:hasBudget, c:hasBeneficiary, d:hasDemography

96 of 116

4. Include links to other URIs (provide context)

praha.eu (Prague)

  • http://praha.eu/contract/7302
  • http://praha.eu/city
  • http://praha.eu/council

mfcr.cz (Ministry of Finance)

  • http://mfcr.cz/prague/budget
  • http://mfcr.cz/prague

risy.cz (Regional Information Service)

  • http://risy.cz/location/prague
  • http://risy.cz/contract/22189-01
  • http://risy.cz/project/22189

czso.cz (Czech Statistical Office)

  • http://registry.czso.cz/prague
  • http://czso.cz/prague
  • http://czso.cz/prague/demogstat

Link predicates: a:fundedBy, b:hasBudget, c:hasBeneficiary, d:hasDemography, owl:sameAs (two links)
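
To illustrate why owl:sameAs links matter, here is a minimal sketch in plain Python. The IRIs come from the slides, but the diagram's edges are not recoverable from the text, so which nodes each owl:sameAs, b:hasBudget, and d:hasDemography link connects is an assumption made purely for illustration. The function names are hypothetical as well.

```python
from collections import defaultdict

OWL_SAME_AS = "owl:sameAs"

# Toy triples. IRIs are from the slides; which pairs the links actually
# connect is an ASSUMPTION made for this illustration.
triples = [
    ("http://praha.eu/city", OWL_SAME_AS, "http://mfcr.cz/prague"),
    ("http://mfcr.cz/prague", OWL_SAME_AS, "http://czso.cz/prague"),
    ("http://mfcr.cz/prague", "b:hasBudget", "http://mfcr.cz/prague/budget"),
    ("http://czso.cz/prague", "d:hasDemography", "http://czso.cz/prague/demogstat"),
]


def same_as_closure(start, triples):
    """Return every IRI reachable from `start` over owl:sameAs, in either direction."""
    neighbours = defaultdict(set)
    for s, p, o in triples:
        if p == OWL_SAME_AS:
            neighbours[s].add(o)
            neighbours[o].add(s)
    seen, todo = {start}, [start]
    while todo:
        for nxt in neighbours[todo.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return seen


def describe(iri, triples):
    """Collect non-sameAs statements made about `iri` under any of its aliases."""
    aliases = same_as_closure(iri, triples)
    return sorted((p, o) for s, p, o in triples if s in aliases and p != OWL_SAME_AS)


print(describe("http://praha.eu/city", triples))
```

Starting from the city's own IRI, the sketch picks up the budget and demography statements that other publishers attached to their own identifiers for the same city.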

97 of 116

The Web of Data

  • http://praha.eu/contract/7302
  • http://praha.eu/city
  • http://praha.eu/council
  • http://mfcr.cz/prague/budget
  • http://mfcr.cz/prague
  • http://risy.cz/location/prague
  • http://risy.cz/contract/22189-01
  • http://risy.cz/project/22189
  • http://registry.czso.cz/prague
  • http://czso.cz/prague
  • http://czso.cz/prague/demogstat

Link predicates: a:fundedBy, b:hasBudget, c:hasBeneficiary, d:hasDemography, owl:sameAs (two links)

98 of 116

Web of documents vs. Web of data (Linked Data)

Web of documents | Linked Data
HTML as document publication format | RDF as a data publication format
URL as a unique global document identifier | URL as a unique global entity identifier
HTTP protocol for accessing documents using their URL | HTTP protocol for accessing data about entities using their URL
Links to other documents | Links to other entities

Vocabularies: standards for common data representation

99 of 116

Linked Open Vocabularies

Catalog of vocabularies used on the Web of Data

Basic rule - vocabulary reuse

  • Schema.org
  • Dublin Core Vocabulary
  • Data Cube Vocabulary
  • Simple Knowledge Organization System (SKOS)

https://lov.linkeddata.es/dataset/lov/

100 of 116

Linked Data = Technical interoperability solution

HTTP API

FTP

SOAP / WSDL

TriG dump

IRI dereference

SPARQL

#LD

101 of 116

Web of Data

If the data from those websites were published as Linked Data, answering a question such as

  • Money spent in Prague on new children's playgrounds in the last 5 years, per child

would take only one step:

  1. Write the query that answers the question.

(In the worst case, a preceding "download the data" step would be needed if federated querying were not supported.)

Prague budget

Basic info about Prague

Prague public contracts

Demography of Prague

EU funded projects in Prague
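
The "one step" claim can be sketched with two toy graphs that share an IRI for the city. Everything except the city IRI is invented for illustration: the `ex:` predicates, the contract identifiers, and all numbers are placeholders, not real data. In practice the query would be written in SPARQL; plain Python is used here only to keep the sketch self-contained.

```python
CITY = "http://praha.eu/city"  # IRI from the slides; everything else is made up

# Source 1: public contracts (hypothetical predicates, invented amounts).
contracts_graph = {
    ("ex:c1", "ex:city", CITY), ("ex:c1", "ex:subject", "playground"),
    ("ex:c1", "ex:year", 2023), ("ex:c1", "ex:amount", 1_200_000),
    ("ex:c2", "ex:city", CITY), ("ex:c2", "ex:subject", "playground"),
    ("ex:c2", "ex:year", 2015), ("ex:c2", "ex:amount", 900_000),
    ("ex:c3", "ex:city", CITY), ("ex:c3", "ex:subject", "road repair"),
    ("ex:c3", "ex:year", 2022), ("ex:c3", "ex:amount", 5_000_000),
}
# Source 2: demography (invented child count).
demography_graph = {(CITY, "ex:children", 200_000)}

# Because both sources use the same IRI for the city, merging is a set union.
merged = contracts_graph | demography_graph


def value(graph, subject, predicate):
    """Return the single object of (subject, predicate), assuming exactly one exists."""
    (obj,) = [o for s, p, o in graph if s == subject and p == predicate]
    return obj


def playground_spend_per_child(graph, city, current_year, years=5):
    """Sum recent playground contracts for `city` and divide by its child count."""
    contracts = [s for s, p, o in graph if p == "ex:city" and o == city]
    total = sum(
        value(graph, c, "ex:amount")
        for c in contracts
        if value(graph, c, "ex:subject") == "playground"
        and current_year - value(graph, c, "ex:year") < years
    )
    return total / value(graph, city, "ex:children")


print(playground_spend_per_child(merged, CITY, current_year=2024))
```

No schema mapping or identifier matching is needed before the query runs, because the shared IRI already ties the two sources together.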

102 of 116

The Web of Data - will it be successful?

103 of 116

The LOD cloud - 2007 - 12 datasets

104 of 116

The LOD cloud - 2008 - 32 datasets

105 of 116

The LOD cloud - 2009 - 89 datasets

106 of 116

The LOD cloud - 2010 - 203 datasets

107 of 116

The LOD cloud - 2011 - 295 datasets

108 of 116

The LOD cloud - 2014 - 570 datasets

109 of 116

The LOD cloud - 2017 - 1146 datasets

110 of 116

The LOD cloud - 2018 - 1229 datasets

111 of 116

The LOD cloud - 2019 - 1239 datasets

112 of 116

The LOD cloud - 2020 - 1255 datasets

113 of 116

The LOD cloud - 2024 - 1346 datasets

114 of 116

Open data - 5 star classification: legal and technical maturity of data

115 of 116

RDF - relation to conceptual model

ex:index.html a ex:Page .
ex:index.html dcterms:creator emp:85740 .
ex:index.html dcterms:subject "education" .
ex:index.html dcterms:language "en" .

emp:85740 a ex:Employee .
emp:85740 ex:number 42 .

Conceptual model (class diagram):

  • Employee: number
  • Page: language (dcterms:language), theme (dcterms:subject)
  • Person
  • creator (dcterms:creator): 1..*, 0..*

ex:Page a rdfs:Class .
ex:Employee a rdfs:Class .
ex:Person a rdfs:Class .
ex:Employee rdfs:subClassOf ex:Person .

dcterms:creator a rdf:Property .
dcterms:subject a rdf:Property .
dcterms:language a rdf:Property .
ex:number a rdf:Property .
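
What `rdfs:subClassOf` buys us can be shown with a small sketch of the RDFS subclass rule: an instance of a class is also an instance of its superclasses. The triples are copied from the slide and kept as plain strings; the function name `infer_types` is made up, and a real application would rely on an RDF library or reasoner rather than this hand-rolled loop.

```python
RDF_TYPE = "rdf:type"  # written as "a" in Turtle
SUBCLASS_OF = "rdfs:subClassOf"

# Triples copied from the slide (prefixed names kept as plain strings).
graph = {
    ("ex:index.html", RDF_TYPE, "ex:Page"),
    ("ex:index.html", "dcterms:creator", "emp:85740"),
    ("emp:85740", RDF_TYPE, "ex:Employee"),
    ("ex:Employee", SUBCLASS_OF, "ex:Person"),
}


def infer_types(graph):
    """Apply the RDFS subclass rule until no new rdf:type triple appears."""
    inferred = set(graph)
    changed = True
    while changed:
        changed = False
        for s, p, cls in list(inferred):
            if p != RDF_TYPE:
                continue
            for sub, q, sup in list(inferred):
                if q == SUBCLASS_OF and sub == cls and (s, RDF_TYPE, sup) not in inferred:
                    inferred.add((s, RDF_TYPE, sup))
                    changed = True
    return inferred


closure = infer_types(graph)
print(("emp:85740", RDF_TYPE, "ex:Person") in closure)
```

The employee is never stated to be a person in the data; that triple follows from the schema alone.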

116 of 116

RDF - relation to conceptual model

ex:index.html a ex:Page .
ex:index.html dcterms:creator emp:85740 .
ex:index.html dcterms:subject "education" .
ex:index.html dcterms:language "en" .

emp:85740 a ex:Employee .
emp:85740 ex:number 42 .

Conceptual model (class diagram):

  • Employee: number
  • Page: language (dcterms:language), theme (dcterms:subject)
  • Person
  • creator (dcterms:creator): 1..*, 0..*

ex:Page a rdfs:Class ;
  rdfs:label "Page"@en ;
  rdfs:comment "A web page"@en .

ex:number a rdf:Property ;
  rdfs:label "Employee number"@en ;
  rdfs:domain ex:Employee ;
  rdfs:range xsd:integer .
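
The effect of `rdfs:domain` and `rdfs:range` can also be sketched in plain Python. The triples mirror the slide; the function names are hypothetical. Note that RDFS itself only infers new facts from domain and range; the range check below is an extra validation step added for illustration and is not part of RDFS entailment.

```python
RDF_TYPE = "rdf:type"

# Schema and data triples mirroring the slide.
schema = {
    ("ex:number", "rdfs:domain", "ex:Employee"),
    ("ex:number", "rdfs:range", "xsd:integer"),
}
data = {("emp:85740", "ex:number", 42)}


def infer_domain_types(data, schema):
    """rdfs:domain rule: the subject of a property is an instance of its domain."""
    domains = {(p, c) for p, k, c in schema if k == "rdfs:domain"}
    return {(s, RDF_TYPE, c) for s, p, _ in data for dp, c in domains if p == dp}


def range_mismatches(data, schema):
    """Illustrative check only: list values that are not ints where xsd:integer is expected."""
    int_props = {p for p, k, c in schema if k == "rdfs:range" and c == "xsd:integer"}
    return [(s, p, o) for s, p, o in data if p in int_props and not isinstance(o, int)]


print(infer_domain_types(data, schema))
print(range_mismatches(data, schema))
```

Even without the explicit `emp:85740 a ex:Employee` statement, using `ex:number` on that resource is enough to conclude it is an employee.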