Evaluation of Federated SPARQL Query Processing Systems
The aim of this survey is to compare the existing SPARQL query federation engines. Please provide details of your system by filling in the following short survey. Many thanks.
System Information
Please provide the title and/or URL of the paper and/or federation system *
Is the code of your SPARQL query federation engine publicly available? *
If the answer to the above is "Yes", please provide the link at which your code can be found
Implementation and licensing *
Please provide the implementation language (e.g., Java, Perl) and licensing (e.g., GNU GPL v2.0) information of your federation system
Type of source selection *
How do you carry out source selection during federated query processing?
Type of join(s) used for data integration *
Please list the names of the joins (e.g., nested loop join, hash join) used in your system
Use of cache *
Do you make use of a cache for any type of performance improvement?
Comment (if any) regarding use of cache
Support for catalog/index update *
The index or catalog used by your system may get out of date with time. Is there any support for automatic index updates to ensure the retrieval of complete results?
Comment (if any) regarding support for catalog/index update
Requirements
Question 1: Result completeness *
Given a SPARQL 1.0 query, can your system retrieve all solutions to the query (100% recall), or is it possible that it misses some of the solutions (for example, due to source selection or the use of an out-of-date index)? Note that if your answer to the previous question (support for catalog/index update) is "No", then result completeness cannot be assured.
Comment (if any) regarding Question 1
Question 2: Policy-based query planning *
Most federation approaches target open data and do not provide means to take data access restrictions (arising from different user access rights) into account during query planning. As a result, a federation engine may select a data source that the requester is not authorized to access, thus overestimating the set of relevant data sources and increasing the query runtime. Does your system have the capability of taking privacy information (e.g., different graph-level access rights for different users) into account during query planning?
Comment (if any) regarding Question 2
Question 3: Support for partial results retrieval *
In some cases the query results can be too large, and result completeness (i.e., 100% recall) may not be required; partial but fast and/or high-quality results are acceptable instead. Does your system provide functionality that lets a user specify a desired recall (less than 100%) as a threshold for fast result retrieval? Note that this is different from limiting the results with the SPARQL LIMIT clause: LIMIT restricts the number of results to a fixed value, whereas in partial result retrieval the number of retrieved results is relative to the actual total number of results.
Comment (if any) regarding Question 3
Question 4: Support for non-blocking operators / adaptive query processing *
SPARQL endpoints are sometimes blocked, down, or exhibit high latency. Does your system support non-blocking joins (where results are returned in the order in which data arrive, not in the order in which data are requested)?
Comment (if any) regarding Question 4
Question 5: Support for provenance information *
SPARQL query federation systems usually integrate results from multiple SPARQL endpoints without any provenance information, such as how many results were contributed by a given SPARQL endpoint or which results were contributed by each endpoint. Does your system provide such provenance information?
Comment (if any) regarding Question 5
Question 6: Query runtime estimation *
In some cases a query may have a long runtime (e.g., several minutes). Does your system provide means to estimate the overall runtime of the query execution in advance and display it to the user?
Comment (if any) regarding Question 6
Question 7: Duplicate detection *
Due to the decentralized architecture of the Linked Data Cloud, one sub-query might retrieve results that were already retrieved by another sub-query. For some applications, such a redundant sub-query can be skipped (i.e., not submitted), as it would only produce overlapping triples. Does your system provide such duplicate-aware SPARQL query federation? Note that this refers to duplicate detection before sub-queries are submitted to the SPARQL endpoints; the aim is to minimize the number of sub-queries submitted by the federation engine.
Comment (if any) regarding Question 7
Question 8: Top-K query processing *
Is your system able to rank results based on the user's preferences, profile, location, etc.?
Comment (if any) regarding Question 8
Question 9: Supported SPARQL types/clauses *
Please select the query properties supported by your system
Comment (if any) regarding Question 9
Further comments
Did we miss something?
Please feel free to add anything that we have missed. Thanks.