Evaluation of Federated SPARQL Query Processing Systems

The aim of this survey is to compare the existing SPARQL query federation engines. Could you please provide details of your system by filling in the following short survey? Many thanks indeed.

System Information

Please provide the title and/or URL of the paper and/or federation system *

Is the code of your SPARQL query federation engine publicly available? *

Yes

Not yet

If the answer to the above is "Yes", please provide the link at which your code can be found

Implementation and licensing *

Please provide the implementation language (e.g., Java, Perl) and licensing (e.g., GNU GPL v2.0) information of your federation system

Type of source selection *

How do you carry out source selection during federated query processing?

Using SPARQL ASK queries

Using a catalog/index

Hybrid (ASK + catalog/index)

Using SPARQL SERVICE clause

Other:
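
To make the first option concrete, here is a minimal sketch of ASK-based source selection. It assumes each endpoint can be modelled as an in-memory set of triples; the `ask` helper stands in for sending a SPARQL ASK query over HTTP, and all names (`ask`, `select_sources`, `ENDPOINTS`, the `ex:` terms) are hypothetical and not taken from any particular engine.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# None marks an unbound position (a variable) in a triple pattern.
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


def ask(triples: Set[Triple], pattern: Pattern) -> bool:
    """Stand-in for an ASK query: True if any triple matches the pattern."""
    return any(
        all(p is None or p == t for p, t in zip(pattern, triple))
        for triple in triples
    )


def select_sources(
    endpoints: Dict[str, Set[Triple]], patterns: List[Pattern]
) -> Dict[Pattern, List[str]]:
    """Map each triple pattern to the endpoints whose ASK answer is true."""
    return {
        pattern: sorted(
            name for name, triples in endpoints.items() if ask(triples, pattern)
        )
        for pattern in patterns
    }


# Hypothetical toy endpoints, for illustration only.
ENDPOINTS: Dict[str, Set[Triple]] = {
    "drugs": {("ex:aspirin", "ex:treats", "ex:headache")},
    "diseases": {("ex:headache", "rdfs:label", "Headache")},
}
```

A catalog/index approach would replace the per-pattern `ask` calls with a lookup in precomputed metadata, and a hybrid approach would fall back to `ask` only where the catalog cannot decide.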

Type of join(s) used for data integration *

Please list the names of the joins (e.g., nested loop join, hash join) used in your system

Use of cache *

Do you make use of a cache for any type of performance improvement?

Yes

Comment (if any) regarding use of cache

Support for catalog/index update *

The index or catalog used by your system may get out of date with time. Is there any support for automatic index updates to ensure the retrieval of complete results?

Yes

Not required (e.g., for catalog/index-free approaches)

Comment (if any) regarding support for catalog/index update

Requirements

Question 1: Result completeness *

Given a SPARQL 1.0 query, can your system retrieve all solutions to the query (100% recall), or might it miss some of them (for example, due to source selection or an out-of-date index)? Note that if your answer to the previous question is "No", result completeness cannot be assured.

Yes

May miss some

Comment (if any) regarding Question 1

Question 2: Policy-based query planning *

Most federation approaches target open data and do not provide means to take data access restrictions (arising from different user access rights) into account during query planning. As a result, a federation engine may select a data source that the requester is not authorized to access, thus overestimating the set of relevant data sources and increasing the query runtime. Does your system have the capability of taking privacy information (e.g., different graph-level access rights for different users) into account during query planning?

Yes

Partial

Comment (if any) regarding Question 2

Question 3: Support for partial results retrieval *

In some cases the query results can be too large, and result completeness (i.e., 100% recall) may not be required; partial but fast and/or high-quality results are acceptable instead. Does your system provide functionality that lets a user specify a desired recall (less than 100%) as a threshold for fast result retrieval? Note that this is different from limiting the results with the SPARQL LIMIT clause: LIMIT restricts the number of results to a fixed value, whereas in partial result retrieval the number of retrieved results is relative to the actual total number of results.

Yes

Comment (if any) regarding Question 3
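
The distinction between a recall threshold and LIMIT in Question 3 can be illustrated with a small sketch. It assumes the engine has an estimate of the total number of results before execution, which is a strong assumption in practice; the function name `take_until_recall` and its parameters are hypothetical.

```python
import math
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def take_until_recall(
    results: Iterable[T], estimated_total: int, recall: float
) -> Iterator[T]:
    """Yield results until the requested fraction of the estimated total is reached.

    Unlike LIMIT, the cut-off scales with the size of the result set:
    a recall of 0.25 yields 10 rows out of 40 but 250 rows out of 1000.
    """
    if not 0.0 < recall <= 1.0:
        raise ValueError("recall must be in the interval (0, 1]")
    # Number of rows needed to reach the requested recall (assumed estimate).
    needed = math.ceil(recall * estimated_total)
    return islice(results, needed)
```

With the same threshold, `take_until_recall(rows, 40, 0.25)` and `take_until_recall(rows, 1000, 0.25)` stop after different numbers of rows, whereas `LIMIT 10` would stop after ten rows in both cases.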

Question 4: Support for non-blocking operators / adaptive query processing *

SPARQL endpoints are sometimes blocked or down or exhibit high latency. Does your system support non-blocking joins (where results are returned based on the order that data arrive, not in the order in which data are requested)?

Yes

Comment (if any) regarding Question 4
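
As an illustration of what Question 4 means by a non-blocking join, the sketch below implements a symmetric hash join, one common non-blocking operator. It is a simplified, single-threaded model: the `arrivals` stream stands in for rows arriving from two endpoints in arbitrary order, and the function name, the "left"/"right" side labels, and the row format are assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, str]


def symmetric_hash_join(
    arrivals: Iterable[Tuple[str, Row]], key: str
) -> Iterator[Row]:
    """Join rows from sides "left" and "right" in the order they arrive.

    Each arriving row is stored in its own side's hash table and immediately
    probed against the other side's table, so matches are emitted without
    waiting for either input to finish.
    """
    tables: Dict[str, Dict[str, List[Row]]] = {
        "left": defaultdict(list),
        "right": defaultdict(list),
    }
    for side, row in arrivals:
        other = "right" if side == "left" else "left"
        tables[side][row[key]].append(row)
        # Probe with .get so that misses do not create empty buckets.
        for match in tables[other].get(row[key], []):
            yield {**match, **row}
```

Because each row is probed on arrival, a slow endpoint delays only the matches that depend on its rows rather than the whole join.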

Question 5: Support for provenance information *

SPARQL query federation systems usually integrate results from multiple SPARQL endpoints without any provenance information, such as how many results were contributed by a given SPARQL endpoint or which results were contributed by each endpoint. Does your system provide such provenance information?

Yes

Partial

Comment (if any) regarding Question 5

Question 6: Query runtime estimation *

In some cases a query may have a long runtime (e.g., several minutes). Does your system provide a means to estimate the overall runtime of the query execution and display it to the user in advance?

Yes

Comment (if any) regarding Question 6

Question 7: Duplicate Detection *

Due to the decentralized architecture of the Linked Data Cloud, one sub-query might retrieve results that were already retrieved by another sub-query. For some applications, the former sub-query can be skipped (not submitted to the federation), as it will only produce overlapping triples. Does your system provide such duplicate-aware SPARQL query federation? Note that this refers to duplicate detection before sub-queries are submitted to the SPARQL endpoints; the aim is to minimize the number of sub-queries submitted by the federation engine.

Yes

Partial (e.g., when the results come in)

Comment (if any) regarding Question 7

Question 8: Top-K query processing *

Is your system able to rank results based on the user's preferences, profile, location, etc.?

Yes

Partial

Comment (if any) regarding Question 8

Question 9: Supported SPARQL types/clauses *

Please select the query properties supported by your system

SPARQL SERVICE keyword

Filter clause

Unbound query predicates

Unbound query subjects

Optional clause

Distinct clause

Order By

Union

Negation

Regex

Limit

Construct query

Describe query

Ask query

Comment (if any) regarding Question 9

Further comments

Did we miss something?

Please feel free to add anything that we have missed. Thanks.