1 of 28

DBpedia Ontology Session

Making the Semantic Web rock-solid

2 of 28

Goal of the DBpedia Ontology Session

  • bootstrap a Unified Semantic Ontology Space (USOS)
  • centrally engineer data quality into the decentralized Semantic Web
  • DBpedia principle:
    • take low-coherence data (Wikipedia) and transform it into a powerful, interconnected knowledge space (DBpedia) with upgraded usefulness
    • DBpedia URIs have provided interoperability in the Linked Data Cloud for over a decade (coherence)
  • How can we apply these principles to ontologies?

3 of 28

Motivation - Why

“So, when is the Semantic Web coming?”
- Thorsten Berger, Professor, Software Engineering in Bochum

Software Repositories: rock-solid

  • Debian Packages
  • Android Apps
  • Maven Central
  • npm
  • PyPI

Linked Data (Ontologies): mixed - do you feel lucky?

Other data repositories: Linked Data is better by orders of magnitude

  • not named here
  • mostly including links to landing pages without download links

4 of 28

How? What’s the lever?

Data Quality - Fitness for Use

  • Intrinsic dimensions (syntactic validity, semantic accuracy, consistency, conciseness and completeness)
  • Accessibility dimensions (availability, licensing, interlinking, security and performance)
  • Contextual & Representational dimensions (see Zaveri et al. 2012, Quality Assessment for Linked Data: A Survey)

5 of 28

How? What’s the lever?

Data Quality - Fitness for Use

  • FAIR - subcategory of data quality
  • wide field of opinions
  • Data Quality is not directly measurable; quantitative metrics are interpreted as quality indicators
  • Data Quality happens at the consumer side

6 of 28

How? What’s the lever?

Who pays the integration tax?

(amount of work needed to get an external artifact working in own system)

7 of 28

How? What’s the lever?

Who pays the integration tax?

  • centrally engineering data quality into the decentralized Semantic Web to bootstrap a Unified Semantic Ontology Space (USOS)
  • shared effort between:
    • publisher/creator
    • mediating software (repository, browser or Archivo)
    • consumer

8 of 28

Archivo

https://archivo.dbpedia.org/

Download all ontologies: https://databus.dbpedia.org/jfrey/collections/archivo-latest-ontology-snapshots

9 of 28

Archivo

10 of 28

Archivo

11 of 28

Archivo

12 of 28

Archivo

13 of 28

Archivo

Who pays the integration tax?

  • Notifying 1400 ontology publishers about the metrics/regulations? SHOULD
  • Some development hours at Archivo
  • Rating to inform the consumer about caveats

14 of 28

DBpedia Archivo

an augmented Ontology Archive

15 of 28

Overview

  • “Web Archive” for Ontologies
  • fully automated: discovery, versioning & testing
  • Current Status: 1368 Ontologies (June 2021) → the most exhaustive unified ontology space
  • ontology FAIRness → goal is the improvement of (re)usability

16 of 28

Finding an Ontology

17 of 28

Finding an Ontology

  • search on the website or on the Databus, by text or with SPARQL

19 of 28

Accessing Ontologies

20 of 28

Accessing Ontologies via an API

Accessing all (or any subset) Ontologies¹:

  • persistence on the Databus allows usage of the Databus Tool Stack
  • Example: Databus Collections → a view on files deployed to the Databus
  • Download all the latest versions as N-Triples files:

# fetch the SPARQL query behind the Databus collection
query=$(curl -H "Accept:text/sparql" https://databus.dbpedia.org/denis/collections/latest_ontologies_as_nt)

# run the query; drop the CSV header row and the surrounding quotes
files=$(curl -H "Accept: text/csv" --data-urlencode "query=${query}" https://databus.dbpedia.org/repo/sparql | tail -n+2 | sed 's/"//g')

# download every file URL returned by the query
while IFS= read -r file ; do wget "$file"; done <<< "$files"

Accessing a single Ontology:

http://archivo.dbpedia.org/download?o={ontology-URI}

e.g. https://archivo.dbpedia.org/download?o=http://dbpedia.org/ontology/

21 of 28

Interoperable Ontologies

22 of 28

Interoperable Ontologies

23 of 28

Debug common Ontology Pitfalls

  • problems with correct ontology deployment
    • during the addition of an ontology
  • ontology star rating for usability
    • testing parsing, license and consistency
  • extensible SHACL library for testing application compliance
    • automatic LODE documentation
    • compliance with Archivo itself

0x⭐ Ontology

  • not parseable
  • no license provided
  • (maybe) logically inconsistent

2x⭐ Ontology

  • parseable & retrievable
  • some license detected
  • license only human readable or not unified
  • logically inconsistent

4x⭐ Ontology

  • parseable & retrievable
  • unified license URI
  • logically consistent
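
The three example ratings can be read as one star per passed check. The sketch below assumes exactly that, with parsing acting as a gate for the other checks. The function name `archivo_stars` and its yes/no flags are hypothetical, and Archivo's actual rating logic may differ.

```shell
# Hypothetical helper: derive a star count from four yes/no check results.
# Arguments: parseable, license_detected, unified_license_uri, consistent
archivo_stars() (
  # Assumption: an ontology that cannot be parsed earns no stars,
  # because the remaining checks cannot be evaluated on it.
  if [ "$1" != yes ]; then
    echo 0
    exit 0
  fi
  stars=1
  # One additional star per passed check (assumed additive scheme).
  for flag in "$2" "$3" "$4"; do
    if [ "$flag" = yes ]; then
      stars=$((stars + 1))
    fi
  done
  echo "$stars"
)

archivo_stars no no no no      # the 0-star example
archivo_stars yes yes no no    # the 2-star example
archivo_stars yes yes yes yes  # the 4-star example
```

Under these assumptions the three calls reproduce the 0, 2 and 4 star cases listed above.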

24 of 28

Ontology Reusability

25 of 28

Ontology Reusability

  • access ontologies even after they have become unavailable at their original location
  • cite a certain version of an ontology (identified with a timestamp)
  • persistent snapshots of any ontology version

One REST request:

http://archivo.dbpedia.org/download?o={ontology-URI}&f={format}&v={version}

e.g. https://archivo.dbpedia.org/download?o=http%3A//www.georss.org/georss/&v=2020.08.10-110000&f=ttl
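
A minimal shell sketch of assembling such a download URL from its o, f and v parameters. The helper names `urlencode` and `archivo_download_url` are hypothetical and not part of Archivo. The sketch assumes an ASCII ontology URI and percent-encodes every reserved character, including the slashes that the georss example leaves unencoded, on the assumption that the endpoint decodes both forms the same way.

```shell
# Hypothetical helper: percent-encode an ASCII string for use in a query
# parameter. The function body runs in a subshell so no variables leak.
urlencode() (
  s=$1
  out=
  while [ -n "$s" ]; do
    rest=${s#?}          # everything after the first character
    c=${s%"$rest"}       # the first character itself
    case $c in
      [a-zA-Z0-9.~_-]) out=$out$c ;;                  # unreserved: keep
      *) out=$out$(printf '%%%02X' "'$c") ;;          # others: %HH
    esac
    s=$rest
  done
  printf '%s' "$out"
)

# Hypothetical helper: build a download URL.
# Arguments: ontology URI, optional format (e.g. ttl), optional version.
archivo_download_url() (
  url="https://archivo.dbpedia.org/download?o=$(urlencode "$1")"
  if [ -n "${2:-}" ]; then url="$url&f=$2"; fi
  if [ -n "${3:-}" ]; then url="$url&v=$3"; fi
  printf '%s\n' "$url"
)

# Print the URL for the georss snapshot used in the example.
archivo_download_url 'http://www.georss.org/georss/' ttl 2020.08.10-110000
```

The printed URL can then be handed to any HTTP client, for instance `curl -L -o georss.ttl "$(archivo_download_url 'http://www.georss.org/georss/' ttl 2020.08.10-110000)"`. Omitting the second and third arguments yields a URL with only the o parameter.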

26 of 28

Summary

Archivo …

  • … is an exhaustive unified space for ontologies
  • … provides findable and easily accessible vocabularies
  • … has a star rating and other tests measuring the interoperability and (re)usability of ontologies
  • … tries to encourage following community standards for ontology metadata

27 of 28

Contribute to Archivo

28 of 28

References

  1. Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18
  2. J. Frey, D. Streitmatter, F. Götz, S. Hellmann, and N. Arndt. DBpedia Archivo - A Web-Scale Interface for Ontology Archiving under Consumer-oriented Aspects, In Semantic Systems. The Power of AI and Knowledge Graphs, 2020. https://doi.org/10.1007/978-3-030-59833-4_2
