1 of 19

Wikidata: Metadata and Tagging

Subscribe 7 (Spring 2016), Berlin

Jens Ohlig (Wikimedia Deutschland)

2 of 19

Wikidata

Wikidata is a free linked database that can be read and edited by both humans and machines.

Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia.

Wikidata can be used by anyone for anything because all data is in the public domain (CC0).

3 of 19

Giving more people more access to more knowledge

Wikidata

4 of 19

What is Wikidata actually?

  • repository of the world's knowledge
  • database anyone can read and edit
  • multi-lingual
  • designed to deal with the reality Wikipedia has to deal with
  • free and open source software

5 of 19

17,300,000+ items

6 of 19

Items

  • Items are real things or concepts, e.g. Berlin, Barack Obama or helium. Each item is identified by a unique ID, e.g. Q76 or Q13813879

  • Items have labels, descriptions, aliases, sitelinks and claims/statements

A single piece of knowledge in the world
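The parts of an item listed above can be sketched as a small data structure. This is a minimal illustration, not Wikidata's actual data model: the class name `Item`, its field names, the `label` helper and the sample description, alias and sitelink values are all assumptions made for the example. Only the ID Q64 and the label "Berlin" come from the slides.

```python
import re
from dataclasses import dataclass, field

# Item IDs are the letter "Q" followed by a number, e.g. Q64 or Q76.
ITEM_ID = re.compile(r"Q[1-9]\d*")


@dataclass
class Item:
    """Illustrative stand-in for an item (hypothetical, simplified)."""
    id: str
    labels: dict[str, str] = field(default_factory=dict)         # language code -> label
    descriptions: dict[str, str] = field(default_factory=dict)   # language code -> description
    aliases: dict[str, list[str]] = field(default_factory=dict)  # language code -> other names
    sitelinks: dict[str, str] = field(default_factory=dict)      # site -> page title
    statements: list = field(default_factory=list)

    def __post_init__(self):
        # Reject anything that does not look like an item ID.
        if not ITEM_ID.fullmatch(self.id):
            raise ValueError(f"not an item ID: {self.id!r}")

    def label(self, lang, fallback="en"):
        # Labels are multilingual: try the requested language,
        # then the fallback language, then the bare ID.
        return self.labels.get(lang) or self.labels.get(fallback) or self.id


# Sample values for illustration only.
berlin = Item(
    id="Q64",
    labels={"en": "Berlin", "it": "Berlino"},
    descriptions={"en": "capital city of Germany"},
    aliases={"en": ["Berlin, Germany"]},
    sitelinks={"enwiki": "Berlin", "dewiki": "Berlin"},
)

print(berlin.label("it"))  # label in the requested language
print(berlin.label("fi"))  # no Finnish label stored, falls back to English
```

Keeping the ID separate from the labels is what lets one item carry names in many languages without any of them being the "real" one.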

7 of 19

Properties

  • Properties are used to label data, e.g. “Born in”, “Date of Death” or “Location”

Some feature that a piece of knowledge can have

8 of 19

Statements

  • Statements hold information e.g.
    • P47 (shares border with) => Q64 (Berlin)
    • P1128 (employees) => 1,000 ± 100

  • Statements also have
    • Qualifiers, to expand on the information
    • References, telling you where the information is from

“This piece of knowledge has this feature (according to this source)”
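A statement with its qualifiers and references might be modelled roughly as follows. This is a sketch under stated assumptions: the classes `Quantity` and `Statement`, the `describe` method, the subject ID `Q_EXAMPLE`, the qualifier and the reference text are all hypothetical. Only P1128 (employees) and the value 1,000 ± 100 are taken from the slide.

```python
from dataclasses import dataclass, field


@dataclass
class Quantity:
    """A number with an optional uncertainty, e.g. 1,000 ± 100."""
    amount: float
    plus_minus: float = 0.0

    def __str__(self):
        if self.plus_minus:
            return f"{self.amount:,.0f} ± {self.plus_minus:,.0f}"
        return f"{self.amount:,.0f}"


@dataclass
class Statement:
    """Illustrative statement: subject, property, value, plus extras."""
    subject: str                                     # item ID the statement is about
    prop: str                                        # property ID, e.g. "P1128"
    value: object                                    # an item ID, a Quantity, ...
    qualifiers: dict = field(default_factory=dict)   # expand on the information
    references: list = field(default_factory=list)   # where the information is from

    def describe(self):
        # "This piece of knowledge has this feature (according to this source)"
        text = f"{self.subject} has {self.prop}: {self.value}"
        if self.qualifiers:
            details = ", ".join(f"{k}: {v}" for k, v in self.qualifiers.items())
            text += f" ({details})"
        if self.references:
            text += f" [according to {'; '.join(self.references)}]"
        return text


# Hypothetical subject, qualifier and reference, for illustration only.
employees = Statement(
    subject="Q_EXAMPLE",
    prop="P1128",
    value=Quantity(1000, 100),
    qualifiers={"point in time": "2016"},
    references=["hypothetical annual report"],
)
print(employees.describe())
```

Qualifiers and references are optional in this sketch, so a bare statement still renders as a plain subject, property and value.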

9 of 19

Item: Berlin (Q64)

Statement: Q64 has P1082: 3.5 million

Property: Population (P1082)
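Because a statement only stores IDs, turning it into something readable means looking up a label for each ID. A minimal sketch, assuming a hypothetical in-memory `LABELS` table and a hypothetical `readable` helper (a real application would fetch labels from Wikidata instead):

```python
# Hypothetical label table for the two IDs used on this slide.
LABELS = {"Q64": "Berlin", "P1082": "population"}


def readable(subject, prop, value, labels=LABELS):
    """Replace IDs with labels where known; unknown IDs are left as-is."""
    return f"{labels.get(subject, subject)} has {labels.get(prop, prop)}: {value}"


print(readable("Q64", "P1082", "3.5 million"))
# prints: Berlin has population: 3.5 million
```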

10 of 19

Example Item

Berlin

Q64

11 of 19

12 of 19

13 of 19

Largest cities with a female mayor

14 of 19

Map of the U1 subway line in Berlin

15 of 19

Tagging at Yle (Finnish Broadcasting Company)

16 of 19

17 of 19

Tagging for TED

18 of 19

19 of 19

Thank you! Any questions?

Jens Ohlig

jens.ohlig@wikimedia.de