@EvoMRI @readermeter
Daniel Mietchen Dario Taraborelli
Wikidata and Wikibase as global platforms for democratizing data publishing
SciDataCon 2018 • Gaborone, 6 November 2018
University of Virginia Wikimedia Foundation
Two main avenues for
democratizing data publishing
A knowledge graph that anyone can edit and query in their own language
A Wikidata-compatible graph database that anyone can set up and federate
Wilkinson et al. (2016) doi.org/10.1038/sdata.2016.18 [image: fosteropenscience.eu CC0]
FAIR data platforms
Wikidata is to data what Wikipedia is to text
550M statements • 760M edits
[as of October 2018]
50 million entities
Types of content in Wikidata
...
...
Wikidata for Research
Data provenance of a Wikidata statement by outlet, publisher and funder
Zika virus • Q202864 • TAXON
has natural reservoir • P1605
Aedes hensilli • Q14573674 • TAXON
stated in • P248
Aedes hensilli as a potential vector of Chikungunya and Zika viruses • Q22330738 • SCIENTIFIC ARTICLE
funded by • P859
Centers for Disease Control and Prevention • Q583725 • GOVERNMENT AGENCY
published in • P1433
PLOS Neglected Tropical Diseases • Q3359737 • SCIENTIFIC JOURNAL
publisher • P123
Public Library of Science • Q233358 • PUBLISHER
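The provenance chain above can be followed with a SPARQL query against the Wikidata Query Service. What follows is a minimal sketch, not the query used for the slide: it assumes the standard Wikidata RDF prefixes (wd:, wdt:, p:, ps:, pr:, prov:) that the query service predefines, and the function and variable names are illustrative. The item and property identifiers are the ones listed above.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def provenance_query(subject="Q202864", prop="P1605"):
    """Build a SPARQL query that follows one statement to its source,
    the source's funder and journal, and the journal's publisher."""
    return f"""
SELECT ?value ?source ?funder ?journal ?publisher WHERE {{
  wd:{subject} p:{prop} ?statement .
  ?statement ps:{prop} ?value ;
             prov:wasDerivedFrom ?reference .
  ?reference pr:P248 ?source .
  OPTIONAL {{ ?source wdt:P859 ?funder . }}
  OPTIONAL {{
    ?source wdt:P1433 ?journal .
    OPTIONAL {{ ?journal wdt:P123 ?publisher . }}
  }}
}}
""".strip()


def run_query(query, endpoint=WDQS_ENDPOINT):
    """Send the query over HTTP and return the JSON result bindings.
    Requires network access; the User-Agent string is a placeholder."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query, "format": "json"})
    request = urllib.request.Request(url, headers={"User-Agent": "provenance-sketch/0.1"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)["results"]["bindings"]
```

Calling `run_query(provenance_query())` would return one binding per reference on the "has natural reservoir" statements of Zika virus; the OPTIONAL blocks keep references whose source lacks a funder, journal or publisher.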
Sample of current biomedical content in Wikidata
Biologists with Canadian citizenship
Institutions where Canadians got their PhD
Co-author graph of McGill-affiliated authors
Award recipients affiliated with McGill
Federation
Wikidata’s identifier mappings
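One way identifier mappings get used is to resolve an external identifier to a Wikidata item. The sketch below builds such a lookup for a DOI. It assumes the DOI property P356 and the convention that Wikidata stores DOIs in upper case; the function name is hypothetical, and the example DOI is the one cited earlier for Wilkinson et al. (2016).

```python
def item_by_doi_query(doi):
    """Build a SPARQL query that finds the Wikidata item carrying a given DOI."""
    doi = doi.strip()
    # Drop common resolver prefixes so only the bare DOI remains.
    for prefix in ("https://doi.org/", "http://doi.org/", "doi.org/", "doi:"):
        if doi.lower().startswith(prefix):
            doi = doi[len(prefix):]
            break
    # Refuse characters that would break out of the SPARQL string literal.
    if '"' in doi or "\\" in doi:
        raise ValueError("unexpected character in DOI")
    # Assumption: DOIs are stored in upper case on Wikidata.
    return f'SELECT ?item WHERE {{ ?item wdt:P356 "{doi.upper()}" . }}'


query = item_by_doi_query("doi.org/10.1038/sdata.2016.18")
```

The same pattern applies to any other external-identifier property: swap P356 for the relevant property and adjust the normalization step.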
From Wikidata to Wikibase
Why Wikibase?
Linked Jazz
“Started thinking about how our data could live in Wikidata and started investigating feasibility of that possibility.
But we have very esoteric project data that doesn’t seem appropriate to be in Wikidata so began looking at our own Wikibase instance.”
Matt Miller (2018) Linked Jazz and Wikibase
What’s Wikibase?
Data formats in Wikibase
(versus wikitext)
A French recording of the word “Canada”
From Lingua Libre
Letters sent by Illuminati
Timeline of software repositories
[SPARQL query] on Wikidata
Timeline of Wikibase instances
[SPARQL query] on the Wikibase registry
Wikibase and software repositories
Combined [SPARQL query]
across Wikidata and the Wikibase registry
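A combined query of this kind typically relies on the SPARQL 1.1 SERVICE clause, which delegates part of the pattern to a second endpoint. The sketch below shows only the general shape and is not the query linked from the slide: the helper name is hypothetical, the local property IRI under example.org is a placeholder, and the only real identifiers are the Wikidata endpoint URL and P571 (inception).

```python
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def federated_query(variables, local_pattern, remote_pattern,
                    remote_endpoint=WIKIDATA_ENDPOINT):
    """Combine a pattern matched on the local endpoint with one delegated
    to a remote endpoint through a SPARQL 1.1 SERVICE clause."""
    select = " ".join(variables)
    return (
        f"SELECT {select} WHERE {{\n"
        f"  {local_pattern}\n"
        f"  SERVICE <{remote_endpoint}> {{\n"
        f"    {remote_pattern}\n"
        f"  }}\n"
        f"}}"
    )


# Hypothetical usage: a local Wikibase links its items to Wikidata items
# through a placeholder property, and Wikidata supplies the inception date.
query = federated_query(
    ["?instance", "?wikidataItem", "?inception"],
    "?instance <http://example.org/prop/sameAsWikidata> ?wikidataItem .",
    "?wikidataItem <http://www.wikidata.org/prop/direct/P571> ?inception .",
)
```

Full IRIs are used instead of the wdt: prefix because a separate Wikibase instance usually binds its prefixes to its own namespace rather than to Wikidata's.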
Further notes
Stacy Allison-Cassin (2018) Wikibase & Indigenous Knowledge in the Canadian Context
Wikidata or Wikibase(s)?
Wikidata | Aspect | Wikibase(s) |
Wikidata community | Governance | depends |
generic | Granularity | depends |
CC0 | Licensing | depends |
stable | Funding | depends |
stable | ID mappings | depends |
many | Language(s) | depends |
Another approach to democratization of data curation:
citizen science happening on Wikimedia projects
see SciDataCon poster 150
Thank you
growth by Fabio Rinaldi [CC BY], research by Minnie Pigeon [CC BY], graph by Icon Lauk [CC BY] from the Noun Project
Slides mashed up with contributions by Andy Mabbett and Andra Waagmeester
These slides are adapted from
D. Mietchen, D. Taraborelli (2018) Wikidata, Wikibase, and a federated ecosystem of structured knowledge for open science. FORCE 2018. doi.org/10.6084/m9.figshare.7195358 [CC BY]