1 of 37

DBpedia Association hour

Sebastian Hellmann, Dimitris Kontokostas, Julia Holze

http://dbpedia.org

2 of 37

DBpedia Association

  • State of the Association
  • Technical Goals
  • Funding + Vision

4 of 37

DBpedia Association (non-profit)

  • Founded in 2014
    • pushed by Sebastian Hellmann
  • Operational since January 2016
  • Backed by the Institute for Applied Informatics (InfAI), Leipzig
  • Support from DBpedia founding members (Soeren, Chris & Kingsley)
  • Draft charter online http://wiki.dbpedia.org/dbpedia-association
    • All data published by the DBpedia Association should be made available free of charge under a license equivalent to CC-0 or CC-BY without further restriction on commercial use and redistribution.

7 of 37

Bootstrapping the Advisory Committee and Community Committee is next on the association's to-do list

Markus Freudenberg

8 of 37

DBpedia Members

Organisational members:

  • Fraunhofer IAIS, OpenLink (Board Members)
  • iMinds, KB, Huygens (Dutch Chapter)
  • Linked Open Data Initiative (LODI, Japanese Chapter)

Membership type (joined / applying):

  • Students: 18 / 6
  • Individual/Self-Employed: 1 / 18
  • SME, Research Institute: 1 / 2
  • Start-up/Small Research Group: 4

9 of 37

DBpedia Chapters: Dutch DBpedia

Cooperation between:

    • Huygens ING, iMinds/Univ.Gent, Vrije Universiteit Amsterdam, Institute for Sound and Vision, Koninklijke Bibliotheek (National Library)
    • and the NL-DBpedia community

Manifested by a Memorandum of Understanding (MoU)

12 September 2016

10 of 37

DBpedia Chapters: Dutch DBpedia

  • Intentions expressed in the MoU:
    • support the goals of the DBpedia Association
    • strengthen the Dutch DBpedia chapter and its community of contributors and users
    • improve the cooperation with Dutch research infrastructure (Digital Humanities, Linguistics) and Dutch Digital Heritage Network
  • Take responsibility, on a best effort basis:
    • keep NL-DBpedia healthy
    • implement/initiate new developments (specific data extraction)
    • organize the community: website, local community meetings
  • Chapter is executive part of the Association and receives:
    • Support for organisation of meetings
    • Financial support for hosting/infrastructure (if income available)
  • MoU available on the Association website as inspiration for other chapters!

11 of 37

Board Meetings

  • May and August 2016
  • Public minutes
  • Resolution resulted in a charter change:

8.4 Voluntary Payment Option for membership fee

    • Applicants can apply for reduction
    • Board approval needed, generally accepted for core contributors
    • Fees are reduced to 20€ per year, but members can pay the full fee voluntarily

12 of 37

PR and Dissemination

  • Website
    • Major redesign, still some work to do
    • Community can publish projects to the project section
  • Twitter
  • Facebook
  • DBpedia Blog
  • Flyers
  • Stickers
  • First booth @SEMANTiCS

13 of 37

Next steps (during 2017)

  • More official chapters
    • Adjusting the MoU for other chapters
  • Better community synchronization
    • Monthly public telcos
  • Instantiate the Advisory and Community Board
  • Improve services and secure funding (next slides)
  • Approve the charter
    • Board agreed that funding is more urgent than structure

14 of 37

DBpedia Association

  • State of the Association
  • Technical Progress
  • Funding + Vision

15 of 37

Some numbers...

  • ~7M hits/day (3.3M in 09/2015)
    • Hoteli Maestral is one of our very high-traffic users
  • 38% improvement in schema-conformant data (2015-04 -> 2016-04)
    • Measured with RDFUnit
    • Data cleanup processes
    • Mapping validation tool
    • Ontology committee
  • 2016-04

16 of 37

Release streamline

Too many pre/post/during steps

  • Step grouping & automating / Shorten time-to-release

Some highlights

  • DataID
    • Dockerized triple store / Automated Download page generation
  • Automated statistics generation
  • Validation checkpoints
    • Ensuring correct progress
    • Data validation filtering

17 of 37

Backlinking

https://github.com/dbpedia/links

  • DBpedians can push their links to be loaded into the main endpoint
  • (not yet published) Maven/Ant build system that updates, validates and packages links

Please wait for the announcement and send a GitHub pull request

(Members can get help from the association for linking and submitting)
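
A minimal sketch of what a pre-submission check on a link file could look like. The build system mentioned above is not yet published, so everything here is an assumption: that links are submitted as N-Triples (one triple per line), that the subject must be a DBpedia resource IRI, and the helper names `validate_link_line` and `split_links`, which are hypothetical.

```python
import re

# Assumption: the subject of each link must be a DBpedia resource
# (main or language chapter). This rule is illustrative only.
DBPEDIA_SUBJECT = re.compile(
    r"^<http://([a-z\-]+\.)?dbpedia\.org/resource/[^<>\s]+>$"
)
# A very loose IRI pattern: angle brackets with no whitespace inside.
IRI = re.compile(r"^<[^<>\s]+>$")


def validate_link_line(line: str) -> bool:
    """Return True if the line looks like a well-formed link triple."""
    line = line.strip()
    if not line.endswith("."):
        return False
    parts = line[:-1].split()
    if len(parts) != 3:
        return False
    subject, predicate, obj = parts
    return bool(
        DBPEDIA_SUBJECT.match(subject)
        and IRI.match(predicate)
        and IRI.match(obj)
    )


def split_links(lines):
    """Partition lines into (valid, invalid), skipping blanks and comments."""
    valid, invalid = [], []
    for line in lines:
        if not line.strip() or line.lstrip().startswith("#"):
            continue
        (valid if validate_link_line(line) else invalid).append(line.strip())
    return valid, invalid


sample = [
    "<http://dbpedia.org/resource/Leipzig> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://example.org/id/leipzig> .",
    "<http://example.org/a> "
    "<http://www.w3.org/2002/07/owl#sameAs> "
    "<http://dbpedia.org/resource/Leipzig> .",
]
valid, invalid = split_links(sample)
```

Running a check of this kind locally before opening a pull request would catch malformed lines early; the actual rules applied by the association may differ.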

18 of 37

Other things we are busy with

  • Further Wikidata integration
  • Global pagerank
  • Mapping alignment (UPM)
  • Mapping discovery & provenance
  • GSoC
  • DBpedia mappings to RML
  • Automated mapping extraction & suggestions
  • List / Table extractors
  • Lookup improvements
  • Topic classification
  • Event detection

19 of 37

Integration of sources

Two ways:

  1. fully automated
    1. automatic conversion
    2. automatic error detection
  2. highly-assisted workflows
    • semi-automatic suggestions
    • test-driven error detection of data errors
    • power tools
    • push to source

Mappings automation, validation & provenance
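
The idea of test-driven error detection can be illustrated with a small sketch: each test is a named check, and running the suite reports which records fail which check. This is not the RDFUnit API; the record fields, test names and the `run_tests` helper are hypothetical and chosen only for illustration.

```python
from datetime import date

# Hypothetical records; field names are illustrative, not DBpedia properties.
records = [
    {"id": "A", "birth": date(1900, 1, 1), "death": date(1980, 5, 2)},
    {"id": "B", "birth": date(1950, 3, 4), "death": date(1920, 1, 1)},
    {"id": "C", "birth": date(1975, 7, 9), "death": None},
]


def has_birth(record) -> bool:
    """Every record should carry a birth date."""
    return record.get("birth") is not None


def birth_before_death(record) -> bool:
    """A death date, if present, must not precede the birth date."""
    birth, death = record.get("birth"), record.get("death")
    if birth is None or death is None:
        return True  # nothing to compare
    return birth <= death


TESTS = {
    "has_birth": has_birth,
    "birth_before_death": birth_before_death,
}


def run_tests(records, tests):
    """Return a mapping of test name to the ids of failing records."""
    return {
        name: [r["id"] for r in records if not check(r)]
        for name, check in tests.items()
    }


failures = run_tests(records, TESTS)
```

In a highly-assisted workflow, the failing ids would be the items shown to a human editor or pushed back to the source for correction.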

20 of 37

DBpedia Association

  • Organisational Structure
  • Technical Issues
  • Funding + Goals

21 of 37

Funding Goals

  • 2017 - build-up phase, DBpedia Association has basic funding for
    • Core staff
    • Some event and PR money
  • However:
    • no funding for hosting
    • no sustainable funding (bound to projects)
  • Goals
    • Increase quality of hosting and data
    • Provide support for community issues, e.g. service downtime, bugs, travel grants, etc.
    • Better publicity and exploitation (events, booths, flyers, add use cases to projects)
    • Merge community contributions (work by Heiko, Marco, etc. has not been included yet)
    • Systematically develop DBpedia's public data, software and services

22 of 37

Funding Strategies

4 main strategies developed during the DBpedia board discussion:

  • Public fundraising (donation campaigns)
  • Direct fundraising (direct company support)
  • Membership fees
  • Community/Project fundraising (e.g. H2020, COST, ITN)

Help from the community welcome.

23 of 37

Strategy 1: Public fundraising

  • Via website
  • Via the Linked Data interface or other services ("Was this service useful for you?")
  • No commitment needed from the donor (one-off donation)

Current action items:

24 of 37

Strategy 2: Direct fundraising

  • Large companies use DBpedia
  • Several programs exist for supporting open-source projects
  • Not effective to write to info@google.com

Current action items:

Step 1: Identify potential organisations: IBM, Google, etc.

Step 2: establish contact

  • We are looking for contacts of community members to approach the right person in
  • Your ideas are welcome

25 of 37

Strategy 2: Subscription fees

  • DBpedia is used in education by university courses
  • Potential to receive subscription fees by universities and university libraries
  • Idea: create a DBpedia educational package
  • Teachers have to communicate the need to their university and libraries
  • Drawback:
    • Currently no know-how in the association on how to set this up
    • Feedback needed from university teachers on how to position ourselves

26 of 37

Strategy 3: Membership fees

  • Good sustainability, i.e. yearly income
  • Good for building a network and establishing relations
  • Does not scale well: income grows only linearly, i.e. twice the members yields only twice the income
  • Not ideal for core community (invested work)

Following a board decision, we added this paragraph “8.4 Voluntary Payment Option” to the charter:

  • Membership fees prevent some community members from joining
  • Reasons: no funding, not the right kind of funding, or the view that it is wrong to pay on top of one's community work
  • Please apply for this option; the board approves applications
  • Membership fees are reduced to 20 Euro yearly

27 of 37

Strategy 4: Community / Project funding

  • Differs from other strategies: often members receive the funding directly
  • EU grants like H2020 or national funding agencies
  • Proposal can contain:
    • Deliverables that support DBpedia, e.g. hosting a community event or improving data/software
    • DBpedia Association as a third party (similar to Europeana) for support, e.g. 50,000 for hosting, additional data in a release or improved extractors
    • Allocation of budget for membership fees of the consortium

Current action items:

  • Create a website with useful information and text snippets to copy into proposals
    • Jens Lehmann volunteered as direct contact for support: jens.lehmann@cs.uni-bonn.de

28 of 37

Strategy 4: Community / Project funding

  • Initial monthly telcos about funding
  • Not ideal: if ideas evolve, not all telco participants can join the proposal
  • New strategy:
    • Reestablish telcos
    • Target “Network funding” such as COST action for traveling and events
  • Not as effective, but fair and open

We are looking for a driver, who sets up and maintains these telcos.

If you wish to join, please contact us.

29 of 37

How to make DBpedia better?

  • Uptime
  • Quality
  • Coverage
  • (new) Services
  • Next Meetings

https://pad.okfn.org/p/how-to-improve-DBpedia

30 of 37

Thanks for the feedback

31 of 37

DBpedia Groups

Ontology Working Group

Communications Group - Facebook, Slack Discussion about communication channels

Internationalisation Group

DataID

(planned) Wikidata Group, Law?

  • Ideas for groups? Association will help set up the group

32 of 37

Wikidata

  • Great source for DBpedia
  • Mission is different from DBpedia (Collect core facts)
  • Limited to notability

Integration, Enrichment, Quality Control, Increased Usefulness

33 of 37

DBpedia is ...

34 of 37

DBpedia is … a very ambitious project

99.3% uptime

25 TB downloadable data

one DBpedia Ontology

decent linkage

decent data quality

Identifiers based on Wikipedia/Wikidata

DBpedia Spotlight

⇒ 99.99% uptime for all languages and chapters and spotlight, scalable hosting via the cloud

⇒ 500 TB

⇒ many consistent domain sub-ontologies for each domain from cars to gas turbines to amoeba to star systems to literature

⇒ highly inter- and backlinked, DBpedia can serve as an entry point to find exactly the data (or knowledge) you need

⇒ improved testing and quality control

⇒ Identifiers based on all existing authoritative and robust identifier providers… starting with company data

⇒ all languages, all domains, improved scalability
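
To make the uptime goal concrete, the two percentages on this slide can be converted into yearly downtime. The sketch below assumes a 365-day year and uses a hypothetical helper name, `downtime_hours_per_year`.

```python
HOURS_PER_YEAR = 365 * 24  # 8760 hours, ignoring leap years


def downtime_hours_per_year(uptime_percent: float) -> float:
    """Convert an uptime percentage into hours of downtime per year."""
    return (100.0 - uptime_percent) / 100.0 * HOURS_PER_YEAR


current = downtime_hours_per_year(99.3)    # today's figure from the slide
target = downtime_hours_per_year(99.99)    # the stated goal

print(f"99.3%  uptime: {current:.1f} hours of downtime per year")
print(f"99.99% uptime: {target * 60:.0f} minutes of downtime per year")
```

Under these assumptions, 99.3% uptime corresponds to roughly 61 hours of downtime per year, while 99.99% allows only about 53 minutes.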

35 of 37

Funding

  • current liquidity is around 10,000€
  • some pre-financing via H2020 and German national funding

Services are stable, but there are lots of “what ifs”

  • What if OpenLink stops hosting the main endpoint?
  • What opportunities do we miss without a properly oiled flagship?

We need organisations to join and provide financial support, links and backlinks and data.

36 of 37

Funding

Become a member, if …

  • you profit from DBpedia in any way
  • you wish to get more in touch with DBpedia
  • the services mentioned on the flyer seem useful for your organisation

Include DBpedia in your proposals:

  • deliverables
  • subcontracting
  • consortium member (either association or the national chapters)

37 of 37

Hope to see you soon.

More information: http://dbpedia.org

Sign up: http://dbpedia.org/membership