SPARQL

Data Formats tutorial

Initial info

Exploratory queries

SKOS cheat sheet

  • https://prefix.cc/skos
  • Class IRI for a concept scheme (a code list): skos:ConceptScheme
  • Class IRI for a code list item: skos:Concept
  • Predicate (property) IRI for a concept label: skos:prefLabel

Task 1

Select the IRIs and labels of all concept schemes.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT *
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel ?label .
}

Task 2

Select the IRIs of all concept schemes. For those that have labels, also return the labels.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT *
WHERE {
  ?s a skos:ConceptScheme .
  OPTIONAL {
    ?s skos:prefLabel ?label .
  }
}

Task 3

Which named graphs contain some skos:Concepts?

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?g
WHERE {
  GRAPH ?g { ?s a skos:Concept . }
}

Task 3.1

Which named graphs contain some skos:Concepts? Sort alphabetically.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?g
WHERE {
  GRAPH ?g { ?s a skos:Concept . }
}
ORDER BY ?g

Task 3.2

List the first 10 named graphs that contain some skos:Concepts. Sort alphabetically.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?g
WHERE {
  GRAPH ?g { ?s a skos:Concept . }
}
ORDER BY ?g
LIMIT 10

Task 3.3

List the second 10 named graphs (the 11th to the 20th) that contain some skos:Concepts. Sort alphabetically.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT DISTINCT ?g
WHERE {
  GRAPH ?g { ?s a skos:Concept . }
}
ORDER BY ?g
LIMIT 10
OFFSET 10
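The same pattern extends to any page: with a page size of 10, page n starts at OFFSET (n - 1) * 10. As a small sketch, the third page of the sorted graph list could be requested like this. ORDER BY matters here, because without a fixed order the endpoint is free to return rows in a different order on each request.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

# Page 3 with a page size of 10: skip the first 20 rows, then return 10.
SELECT DISTINCT ?g
WHERE {
  GRAPH ?g { ?s a skos:Concept . }
}
ORDER BY ?g
LIMIT 10
OFFSET 20
```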

Task 3.4

Select all graphs in an endpoint and the number of triples in each of them, ordered from the largest to the smallest.

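# In standard SPARQL 1.1, an aggregate in SELECT must be written as (expression AS ?variable).
SELECT ?g (COUNT(*) AS ?triples)
WHERE {
  GRAPH ?g { ?s ?p ?o }
}
GROUP BY ?g
ORDER BY DESC(?triples)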

Task 4

Select IRIs and labels of all skos:ConceptSchemes from a given SPARQL endpoint that are labeled “Continent” in English

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel ?label .
  FILTER (?label = "Continent")
}

Won’t work. “Continent” is of type xsd:string, because it has no language tag. Labels in English have the language tag @en and are of type rdf:langString, so the two literals are never equal.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel ?label .
  FILTER (?label = "Continent"@en)
}

Works, but not ideal. Here, every concept scheme with a label matches the graph pattern first, and only then are the results filtered.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel "Continent"@en .
}

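Fastest alternative. The literal is matched directly during the pattern matching phase, so no separate filtering step is needed. Note that SELECT * now returns only ?s, because the label is a constant rather than a variable.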

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel ?label .
  FILTER (STR(?label) = "Continent")
}

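Alternative. STR extracts the textual value (the lexical form) from ?label and drops the language tag, so that the comparison with a plain string works. As a consequence, it also matches “Continent” labels in languages other than English.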
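If the STR approach is preferred but the match should still be limited to English, the language tag can be checked separately. The following is only a sketch of one possible way to combine the two tests; it uses the standard LANG and langMatches functions.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?s ?label
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel ?label .
  # Compare the lexical form and the language tag separately.
  FILTER (STR(?label) = "Continent" && langMatches(LANG(?label), "en"))
}
```

Like the other FILTER variants, this is evaluated after pattern matching, so it is not expected to be faster than matching the literal "Continent"@en directly.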

Task 5

Select IRIs and labels of all skos:ConceptSchemes from a given SPARQL endpoint that have a label starting with “C”.

PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel ?label .
  FILTER (STRSTARTS(STR(?label), "C"))
}

Here, there is no faster alternative: a prefix test cannot be written as a triple pattern, so a FILTER is needed.
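STRSTARTS is case-sensitive, so labels starting with a lowercase “c” are not returned by the query above. If those should be included as well, one option (a sketch, not the only way) is to lowercase the label with LCASE before testing the prefix.

```sparql
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT *
WHERE {
  ?s a skos:ConceptScheme ;
     skos:prefLabel ?label .
  # LCASE lowercases the label, so both "C..." and "c..." match the prefix "c".
  FILTER (STRSTARTS(LCASE(STR(?label)), "c"))
}
```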

Dataset-specific queries

Initial info 1/2

We will be working with the Czech National Open Data Catalog (https://data.gov.cz), which is also available in English.

The data in the endpoint follows the DCAT W3C Recommendation and its Czech application profile, Rozhraní katalogů otevřených dat: DCAT-AP-CZ (in Czech).

You can get the necessary prefixes like this: https://prefix.cc/dct (for dct:, which the queries in this part write as dcterms:; both stand for the same namespace).

You can also get the data view by navigating from a dataset detail in the UI:

Initial info 2/2

This link points to the IRI of the entity being viewed.

This link’s URL is not the entity IRI. It is the URL of the web page that displays the entity. You can click on it to get the entity from the About part.
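Once an entity IRI has been copied from the data view, roughly the same information can be requested from the endpoint by listing everything stated about that entity. The sketch below uses a placeholder IRI, https://example.org/dataset/1, which is purely hypothetical and has to be replaced with a real IRI from the catalog.

```sparql
SELECT ?p ?o
WHERE {
  # Hypothetical IRI for illustration only; paste a real entity IRI here.
  <https://example.org/dataset/1> ?p ?o .
}
LIMIT 100
```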

Task 6

Select 100 datasets that have distributions in the RDF TriG data format.

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?dataset
WHERE {
  ?dataset a dcat:Dataset ;
           dcat:distribution ?distribution .
  ?distribution dcterms:format <http://publications.europa.eu/resource/authority/file-type/RDF_TRIG> .
}
LIMIT 100

Task 7

Select 100 datasets that have distributions in the RDF TriG data format. Include their names.

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?dataset ?name
WHERE {
  ?dataset a dcat:Dataset ;
           dcterms:title ?name ;
           dcat:distribution ?distribution .
  ?distribution dcterms:format <http://publications.europa.eu/resource/authority/file-type/RDF_TRIG> .
}
LIMIT 100

However, some datasets have names in both Czech and English. Those will be listed twice, once for each name.
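If any one name per dataset is good enough, the duplicate rows can be collapsed by grouping on the dataset and picking an arbitrary name with the SAMPLE aggregate. This is only a sketch of that idea; ?anyName is an illustrative variable name, and SAMPLE gives no guarantee about which language is chosen.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

# One row per dataset; SAMPLE picks one of its titles arbitrarily.
SELECT ?dataset (SAMPLE(?name) AS ?anyName)
WHERE {
  ?dataset a dcat:Dataset ;
           dcterms:title ?name ;
           dcat:distribution ?distribution .
  ?distribution dcterms:format <http://publications.europa.eu/resource/authority/file-type/RDF_TRIG> .
}
GROUP BY ?dataset
LIMIT 100
```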

Task 8

Select 100 datasets that have distributions in the RDF TriG data format. Include their names in English.

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?dataset ?name
WHERE {
  ?dataset a dcat:Dataset ;
           dcterms:title ?name ;
           dcat:distribution ?distribution .
  ?distribution dcterms:format <http://publications.europa.eu/resource/authority/file-type/RDF_TRIG> .
  FILTER(langMatches(LANG(?name), "en"))
}
LIMIT 100

Sometimes, language tags can be more specific than just @en, e.g. @en-GB for an explicit indication of British English. langMatches() takes care of the comparison; FILTER(LANG(?name) = "en") does not.
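The English filter drops datasets that have no English name at all. If such datasets should still be listed, one possible approach is to fetch the English and Czech names in separate OPTIONAL blocks and let COALESCE return the first one that is bound. The variable names ?nameEn and ?nameCs are illustrative, and the sketch assumes that Czech names are tagged @cs.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

# Prefer the English name and fall back to the Czech one.
SELECT DISTINCT ?dataset (COALESCE(?nameEn, ?nameCs) AS ?name)
WHERE {
  ?dataset a dcat:Dataset ;
           dcat:distribution ?distribution .
  ?distribution dcterms:format <http://publications.europa.eu/resource/authority/file-type/RDF_TRIG> .
  OPTIONAL {
    ?dataset dcterms:title ?nameEn .
    FILTER(langMatches(LANG(?nameEn), "en"))
  }
  OPTIONAL {
    ?dataset dcterms:title ?nameCs .
    FILTER(langMatches(LANG(?nameCs), "cs"))
  }
}
LIMIT 100
```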

Task 9

How many datasets have distributions in the RDF TriG data format?

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT (COUNT(DISTINCT ?dataset) AS ?count)
WHERE {
  ?dataset a dcat:Dataset ;
           dcat:distribution ?distribution .
  ?distribution dcterms:format <http://publications.europa.eu/resource/authority/file-type/RDF_TRIG> .
}

Task 10

Is there a dataset with an RDF TriG distribution?

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

ASK {
  ?dataset a dcat:Dataset ;
           dcat:distribution ?distribution .
  ?distribution dcterms:format <http://publications.europa.eu/resource/authority/file-type/RDF_TRIG> .
}

Task 11

List datasets with more than 10 distributions, sorted by the number of distributions.

PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?dataset (COUNT(?distribution) AS ?count)
WHERE {
  ?dataset a dcat:Dataset ;
           dcat:distribution ?distribution .
}
GROUP BY ?dataset
HAVING(COUNT(?distribution) > 10)
ORDER BY DESC(?count)

Task 12

Return RDF triples:

<dataset-uri> a dcat:Dataset ;
    rdf:value "number of distributions"^^xsd:integer .

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

CONSTRUCT {
  ?dataset a dcat:Dataset ;
           rdf:value ?count .
} WHERE {
  ?dataset a dcat:Dataset .
  {
    # The subquery counts distributions per dataset. GROUP BY is required
    # because ?dataset is selected next to the aggregate.
    SELECT ?dataset (COUNT(DISTINCT ?distribution) AS ?count)
    WHERE {
      ?dataset dcat:distribution ?distribution .
    }
    GROUP BY ?dataset
  }
  FILTER(?count > 10)
}

Task 13

List datasets whose name starts with “H”, together with information on whether the dataset has a contact point specified.

PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?dataset ?hasContactPoint
WHERE {
  ?dataset a dcat:Dataset ;
           dcterms:title ?title .
  OPTIONAL { ?dataset dcat:contactPoint ?cp }
  FILTER(STRSTARTS(?title, "H"))
  BIND(BOUND(?cp) AS ?hasContactPoint)
}
LIMIT 100

BIND allows you to compute a value based on an expression and store it in a variable. Here, BOUND(?cp) is true when the OPTIONAL pattern found a contact point and false otherwise.
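BIND can hold any expression, not only a boolean test. As a sketch, the variant below combines it with the standard IF function to return a readable text instead of true or false; the variable name ?contactInfo and the two text values are illustrative choices.

```sparql
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dcat: <http://www.w3.org/ns/dcat#>

SELECT ?dataset ?title ?contactInfo
WHERE {
  ?dataset a dcat:Dataset ;
           dcterms:title ?title .
  OPTIONAL { ?dataset dcat:contactPoint ?cp }
  FILTER(STRSTARTS(STR(?title), "H"))
  # IF returns the second argument when the test is true, otherwise the third.
  BIND(IF(BOUND(?cp), "has contact point", "no contact point") AS ?contactInfo)
}
LIMIT 100
```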