Data Visualization with GraphDB and Workbench
Contents
1.1 Built-in SPARQL Result Visualizations
1.2 Using SPARQL Results in Spreadsheets
2.1 Difference Between Interactive and Programmatic Endpoint
3 Help With Writing SPARQL Queries
3.7 AKSW Neural SPARQL Machines
3.13 OpenLink iSPARQL Query Builder
5.1 Built-in Overview Visualizations
5.2 Built-in Graph Visualizations
5.3 Developing Graph Visualizations
The way to get data out of a semantic repository is with SPARQL. Thanks to integrating the excellent YASGUI editor, GDB offers three very useful kinds of auto-completion.
Through the YASGUI integration, GraphDB Workbench can show charts from SPARQL results using Google Charts and pivot tables. For example, let's try query F4 (Top-level industries by number of companies) on http://factforge.net:
After invoking it, go to Google Chart > Chart Config and select an appropriate chart type (bar, line, pie, etc.). There are also pivot tables and charts that allow you to analyze two- and higher-dimensional result sets.
You can also download results in a number of formats (TSV and CSV are the most useful) for analysis in other programs, e.g. Google Sheets or Excel.
Two more examples come from the Getty Vocabularies sample queries (there they are made with external tools, but we recreated them in Workbench):
The Google Sheet FactForge-Industries imports the data of the above query, and then makes a similar chart.
=IMPORTDATA("http://factforge.net/repositories/ff-news?query=%23+F4%3A+Top-level+industries+by+number+of+companies%0A%23+-+benefits+from+the+mapping+and+consolidation+of+industry+classifications%0A%23+++and+predicates+in+DBPedia+done+in+the+FactForge%0A%23+-+benefits+from+reasoning+-+transitive+and+symmetric+properties+across%0A%23+++the+industry+classification+taxonomy+of+FactForge%0A%0APREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0APREFIX+ff-map%3A+%3Chttp%3A%2F%2Ffactforge.net%2Fff2016-mapping%2F%3E%0A%0ASELECT+DISTINCT+%3Ftop_industry+(COUNT(*)+AS+%3Fcount)%0A%7B%0A+++%3Fcompany+dbo%3Aindustry+%3Findustry+.%0A+++%3Findustry+%5Eff-map%3AindustryVariant+%2F+ff-map%3AindustryCenter+%3Ftop_industry+.%0A%7D%0AGROUP+BY+%3Ftop_industry+ORDER+BY+DESC(%3Fcount)+")
=regexreplace(A2,"http://dbpedia.org/resource/","")
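The same cleanup can be done outside a spreadsheet. Here is a minimal Python sketch (with hypothetical result values) that parses a downloaded TSV result and strips the DBpedia namespace, mirroring the regexreplace() formula above:

```python
import csv
import io

# Hypothetical sample of a TSV result as downloaded from the Workbench
tsv_data = """?top_industry\t?count
http://dbpedia.org/resource/Software\t1200
http://dbpedia.org/resource/Internet\t800
"""

PREFIX = "http://dbpedia.org/resource/"

rows = []
reader = csv.reader(io.StringIO(tsv_data), delimiter="\t")
header = next(reader)  # skip the variable-name header row
for uri, count in reader:
    # Same cleanup as the regexreplace() formula: drop the DBpedia namespace
    rows.append((uri.replace(PREFIX, "", 1), int(count)))

print(rows)  # [('Software', 1200), ('Internet', 800)]
```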
Not everyone knows SPARQL, so it is important to be able to give queries to other people so they can invoke them, or use a query link to invoke the query from another program. To this end, you need to understand several things as described below.
GDB WB includes an interactive endpoint where one can enter and edit queries (e.g. http://factforge.net/sparql). But for programmatic access to query results, one should find the "straight" (direct) endpoint. To find it, see Help > REST API. First find the repository name: http://factforge.net/rest/repositories returns the list of repositories as JSON:
curl http://factforge.net/rest/repositories | jq .
## there's another called SYSTEM ...
"id": "ff-news",
"title": "ff-news",
"uri": "http://factforge.net/repositories/ff-news",
The second part of the REST API shows the SPARQL endpoint, e.g. in this case http://factforge.net/repositories/ff-news
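For a program, the same lookup can be scripted. A sketch assuming the JSON shape shown above (the sample list below is an abbreviated, hypothetical excerpt):

```python
import json

# Sample response from /rest/repositories, with the "id", "title" and "uri"
# fields shown above (hypothetical excerpt of the full list)
repos_json = """[
  {"id": "SYSTEM", "title": "System repository",
   "uri": "http://factforge.net/repositories/SYSTEM"},
  {"id": "ff-news", "title": "ff-news",
   "uri": "http://factforge.net/repositories/ff-news"}
]"""

repos = json.loads(repos_json)
# The "uri" field of each entry is the repository's SPARQL endpoint
endpoints = {r["id"]: r["uri"] for r in repos}
print(endpoints["ff-news"])  # http://factforge.net/repositories/ff-news
```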
If your query has some variable parts, you want to be able to bind these parameters to values at the time you invoke the query. This is explained in the rdf4j documentation (the same as in the older Sesame documentation). Jena has a similar feature, and the Data Incubator Linked Data Patterns book explains the concept. By convention, parameters are preceded by "$", as opposed to query variables, which are preceded by "?". For example, here's a query to return the industries of a given $company from FactForge's copy of DBpedia:
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?industry {$company dbo:industry ?industry}
You can bind a value to the query parameter by including a corresponding HTTP request parameter, see below. You need to format the value in N-Triples syntax (unfortunately you cannot use prefixes), e.g. $company=<http://dbpedia.org/resource/Google>.
After editing a query until you are satisfied with the results, you can get a link to invoke the query from GDB WB:
To invoke a query, do HTTP GET on the link constructed above. By default the result format is CSV (text/csv).
For example, invoking the query above with parameter dbr:Google and returning TSV:
curl -H Accept:text/tab-separated-values "http://factforge.net/repositories/ff-news?infer=true&sameAs=false&query=PREFIX+dbo%3A+%3Chttp%3A%2F%2Fdbpedia.org%2Fontology%2F%3E%0ASELECT+%3Findustry+%7B%24company+dbo%3Aindustry+%3Findustry%7D%0A&$company=<http://dbpedia.org/resource/Google>"
?industry
<http://dbpedia.org/resource/Software>
<http://dbpedia.org/resource/Internet>
<http://dbpedia.org/resource/Mobile_device>
<http://dbpedia.org/resource/Cloud_computing>
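Instead of hand-encoding the URL, any HTTP library can build it. A Python sketch using only the standard library, following the curl example above (note that curl left the $company key unencoded, while urlencode percent-encodes it as %24company; the server decodes both the same way):

```python
from urllib.parse import urlencode

ENDPOINT = "http://factforge.net/repositories/ff-news"

QUERY = """\
PREFIX dbo: <http://dbpedia.org/ontology/>
SELECT ?industry {$company dbo:industry ?industry}
"""

# The query itself, plus a binding for $company in N-Triples syntax
params = urlencode({
    "infer": "true",
    "sameAs": "false",
    "query": QUERY,
    "$company": "<http://dbpedia.org/resource/Google>",
})
url = ENDPOINT + "?" + params
print(url)

# To actually invoke it (requires network access), request TSV results:
# import urllib.request
# req = urllib.request.Request(url,
#     headers={"Accept": "text/tab-separated-values"})
# print(urllib.request.urlopen(req).read().decode())
```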
There are various tools that can help you write SPARQL queries. We'd be happy to help you deploy one of these tools on a GDB SPARQL endpoint. Many of these are research prototypes, and some of them are based on Controlled Natural Language (CNL).
SQUALL (Semantic Query and Update High-Level Language) is a project for querying RDF data in CNL by Sébastien Ferré (Sebastien.Ferre@irisa.fr), head of the SemLIS research team at IRISA, University of Rennes 1. It uses Montague grammars and claims: "SQUALL has a strong adequacy with RDF, and covers all SPARQL 1.0 constructs, and many of SPARQL 1.1. Its syntax completely abstracts from low-level notions such as bindings and relational algebra. It features disjunction, negation, quantifiers, built-in predicates, aggregations with grouping, and n-ary relations through reification."
See publications [1] [2] [3] and examples. E.g. the question
Which person is an author of at least 10 publication-s?
translates to:
SELECT DISTINCT ?x1 WHERE {
?x1 a :person .
{SELECT DISTINCT ?x1 (COUNT(DISTINCT ?x3) AS ?x2) WHERE {
?x3 a :publication .
?x3 :author ?x1 .
} GROUP BY ?x1}
FILTER (?x2 >= 10)
}
We tried it on the Getty Vocabularies LOD with a question
Which subject-s have at least 3 prefLabel-s?
SQUALL came up with the following query that is quite adequate:
SELECT DISTINCT ?x1 WHERE {
?x1 a :Subject .
{SELECT DISTINCT ?x1 (COUNT(DISTINCT ?x3) AS ?x2) WHERE
{?x1 :prefLabel ?x3}
GROUP BY ?x1}
FILTER (?x2 >= 3)}
Sparklis is a project by the same group at IRISA. It reconciles expressivity and usability in semantic search by tightly combining a Query Builder, a Natural Language Interface, and a Faceted Search system.
It is an excellent UI that guides the user in writing a query in English. The query is incrementally converted to SPARQL, and partial results are shown in facets to provide better guidance. I've explored several CNLs for SPARQL, but this one is the most practical and usable. See the YouTube video, demo, examples, and paper [4].
Original announcement Tue 4/11/2017 by S. Ferré on the dbpedia-discuss mailing list:
"As a researcher in the Semantic Web, I developed in the past 3 years Sparklis, a tool to allow people to explore and query SPARQL endpoints without any knowledge of SPARQL. It combines the following features:
DBpedia is already accessible through Sparklis at this URL. It may be valuable to propose Sparklis as a query builder in the "Online access" section of the DBpedia website. To the best of my knowledge, there is no equivalent tool. By the way, I am very interested by any feedback on the tool. I am also interested to collaborate on any needed improvement.
Sparklis has recently been adopted by Persée, where it is used by researchers in the social sciences; since then Sparklis has received more than 1000 hits per month."
Below you see a reading of the query in English, and the results as table and shown on a map.
ViziQuer (http://viziquer.lumii.lv/) lets you graphically construct and execute rich data-analysis queries over RDF data. Development of this tool started in 2008 and continues to this day.
Includes examples of queries over complex schemas:
Papers (newest first):
Related software:
For example, here is the hospital data schema of the hospital queries example shown above:
Source code:
This blog post describes a system that uses NLP techniques to parse questions and convert them to SPARQL queries for Wikidata. See the demo and source. For example, the question
"Who is Obama's wife?"
is parsed as
(SBARQ
(WHNP (WP Who))
(SQ (VBZ is) (NP (NP (NNP Obama) (POS 's)) (NN wife)))
(. ?))
which is then simplified to this "typical question":
{
  "prop": "wife",
  "qtype": "who",
  "subject": "obama"
}
and is then translated to this query:
SELECT ?valLabel ?type
WHERE {{
wd:Q76 p:P26 ?prop .
?prop ps:P26 ?val .
OPTIONAL {
?prop psv:P26 ?propVal .
?propVal rdf:type ?type .
}}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}}
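The last step is essentially template filling. A hypothetical Python sketch of it, assuming the ID mappings (obama → wd:Q76, wife → P26) come from a separate entity-linking step; $ENTITY and $PROP are our own placeholders, not part of the system:

```python
# Hypothetical lookup tables standing in for the system's entity-linking step
ENTITY_IDS = {"obama": "Q76"}
PROPERTY_IDS = {"wife": "P26"}

# Template matching the generated query above
TEMPLATE = """SELECT ?valLabel ?type
WHERE {{
  wd:$ENTITY p:$PROP ?prop .
  ?prop ps:$PROP ?val .
  OPTIONAL {
    ?prop psv:$PROP ?propVal .
    ?propVal rdf:type ?type .
  }}
SERVICE wikibase:label { bd:serviceParam wikibase:language "en"}}"""

def build_query(question):
    # question is the simplified "typical question" dict
    return (TEMPLATE
            .replace("$ENTITY", ENTITY_IDS[question["subject"]])
            .replace("$PROP", PROPERTY_IDS[question["prop"]]))

sparql = build_query({"prop": "wife", "qtype": "who", "subject": "obama"})
print(sparql)
```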
Hokkaido University has developed a tool that can generate SPARQL for a specific class of queries (e.g. "Songs written by Paul McCartney") and check the results against Wikipedia categories.
E.g. http://wnews.ist.hokudai.ac.jp/wc3?search=Songs+written+by+Paul+McCartney generates a query like this:
?s rdf:type <http://dbpedia.org/ontology/MusicalWork> .
?s <http://dbpedia.org/ontology/writer> <http://dbpedia.org/resource/Paul_McCartney>.
that returns 128 songs. The quality of this result set is estimated as Precision = 0.93, Recall = 0.92 based on the articles listed in Category:Songs_written_by_Paul_McCartney.
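Precision and recall here have their standard meanings. A quick sketch with hypothetical counts consistent with the reported figures (only the 128 returned songs comes from the text; the other two counts are assumed):

```python
returned = 128        # songs returned by the generated query (from the text)
relevant = 129        # hypothetical size of the Wikipedia category
true_positives = 119  # hypothetical overlap between the two sets

precision = true_positives / returned  # fraction of returned songs that are correct
recall = true_positives / relevant     # fraction of category songs that were found
print(round(precision, 2), round(recall, 2))  # 0.93 0.92
```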
AutoSPARQL converts a natural language expression to a SPARQL query. See papers [5] [6].
It leverages:
Currently there is no working demo :-(
Neural SPARQL Machines (NSpM) use "seq2seq" neural networks to translate natural language to SPARQL; the approach requires a large number of example pairs to train the network.
Links:
Unfortunately I cannot find a demo.
Quepy is a Python framework that transforms natural language questions to SPARQL. It can be customized to different kinds of questions.
For example
is translated to
SELECT DISTINCT ?x1 WHERE {
?x0 rdf:type dbo:Film.
?x0 dbp:name "Friends"@en.
?x0 dbo:releaseDate ?x1.
}
Grammatical Framework (GF) is a CNL framework that allows you to define an abstract grammar for a domain and then "surface grammars" in various languages. This enables assisted text entry. An outstanding feature of GF is that it guides the user to write only correct text; e.g. the fridge magnets demo allows one to enter sentences about food. By including resources for multiple languages, GF can provide multilingual translations.
The translation demo shows the inter-language correspondence of text and provides translation.
When one of the surface languages is SPARQL, this enables CNL to SPARQL translation.
Ontotext used GF in the FP7 MOLTO project: see the relevant publications, in particular [10] [11] [12] [13]. We translated:
Cognitum is the maker of Fluent Editor for entering and querying RDF data using CNL. SmartBI is a tool for making business-intelligence queries in CNL. See the product page and YouTube video.
NLI GO is a generic natural language interaction (NLI) library written in Go that provides a natural language interface to databases. It allows the user to ask questions in English, which are answered by looking up information in semantic repositories or databases.
There is a demo using DBpedia via SPARQL queries. The scope of the demo is quite limited; you can ask only the following questions:
The library is open source and uses a number of advanced techniques, so it could be extended to handle more complex questions:
ExConQuer consists of two tools:
A demo video is available on Vimeo.
Both the tool's home page and demo installation are down, so development seems to have stopped.
The iSPARQL query builder is a drag-and-drop visual interface for constructing SPARQL queries. It is deployed on DBpedia and some other datasets. You drag constructs from a toolbar, and it builds the corresponding query.
Sparnatural (https://sparnatural.eu) is a SPARQL assistant by Sparna. It looks a lot like the ResearchSpace Semantic Search that we did for the British Museum in 2012-2014 (see the Confluence pages), which was then taken up by Metaphacts in the Metaphactory product. See e.g. this demo with French National Library (BnF) data: https://sparnatural.eu/demos/demo-bnf/index.html
It has a data explorer that allows you to explore instances of interest, and make a subgraph of some of its connections.
E.g. here we show the famous Окото ("the Eye") lake in the Rila mountains of Bulgaria:
Then we can generalize some of the nodes to variables, e.g. to find all glacial lakes in Rila. The generated query and its results are shown:
It has demos for DBpedia and Wikidata, but can access only Wikidata's direct (truthy) statements, not full statements with qualifiers and references.
Wikidata items and properties use numbered URLs, which many users find hard to work with. There is a simple query builder with auto-completion; however, it is still very limited and can show results only as a table.
The Wikidata Query UI has excellent auto-completion. E.g. if you type the following and press Control-Space where indicated, Wikidata fills in the correct numeric IDs (wdt:Pnnn for properties and wd:Qnnn for items):
?item
wdt:instance of<c-space> wd:glacial lake<c-space>;
wdt:country<c-space> wd:bulgaria<c-space>
However, one needs to know the structure of SPARQL and the structure of the Wikidata ontology to be able to write queries. E.g. here is a more advanced example that shows all glacial lakes in Bulgaria (concentrated in the Rila and Pirin mountains), together with their coordinates and images, on a map view:
#defaultView:Map
SELECT DISTINCT ?item ?itemLabel ?coords ?image WHERE {
?item wdt:P17 wd:Q219.
?item wdt:P31/wdt:P279* wd:Q211302.
OPTIONAL {?item wdt:P625 ?coords}
OPTIONAL {?item wdt:P18 ?image}
SERVICE wikibase:label { bd:serviceParam wikibase:language "[AUTO_LANGUAGE],bg". }
}
We use the wikibase:label service to pick a label in the user's language (e.g. English), falling back to Bulgarian. Here is the result:
Starting from such a query, there is a Query Helper that allows a user who doesn't know SPARQL to change the query:
E.g. here is a map of 30 museums in Greece:
GraphDB Workbench includes some Built-in SPARQL Result Visualizations that can represent 2-dimensional data (Pivot Charts). In this section we describe more sophisticated approaches to such data.
There is a well-established ontology for representing statistical data in RDF: the W3C Data Cube. It incorporates an OLAP data model and statistical classifications following SDMX. A number of statistical datasets are available as RDF, including:
The same link, Data Cube Implementations, describes a number of specialized statistical data visualization tools. Below we describe two of them, as well as the more powerful CubesViewer, which however does not yet work with W3C Cubes.
CubeViz is an AKSW project that provides a faceted browser and visualization components.
This shot is from CubeVizJS:
All the other shots are from CubeViz - the OntoWiki component.
Below are two Polar Charts. The second shows data from the EU Digital Agenda scoreboard.
Configuring a Polar Chart:
The OpenCube Toolkit was developed by the OpenCube project. It provides a number of related solutions for publishing, exploring and visualizing W3C Cube datasets:
Data Creating
Data Expanding
Data Exploring
Several components (the R2RML extension, OpenCube Browser, OpenCube OLAP Browser, MapView) are tightly integrated in fluidOps IWB and it doesn't seem easy to use them outside of this environment.
A number of public showcases are listed, but there is no working demo at present.
CubesViewer is an excellent OLAP visualization tool: demo, CubesViewer Studio demo, source, documentation. Below are several examples, gathered from the demo, documentation and twitter.
It is based on the DataBrewery Cubes OLAP framework: source, documentation. Unfortunately this framework does not yet support W3C Cubes; I raised an issue to gauge interest in such a feature.
QB.js allows you to explore data expressed as RDF Data Cubes. It was developed around 2013 by Rensselaer Polytechnic Institute (RPI) and is implemented in JavaScript using D3 and jQuery. The use of these modern technologies makes it a promising alternative, but the project is not maintained.
A potential alternative by RPI is "WhyIs".
In the above sections we showed charts of tabular SPARQL results. However, RDF is a graph data model, and often it is useful to see the structure of that graph.
GDB WB includes visualizations that give you a quick overview of a repository. Because these process all data in the repo, they take a while to build and are cached.
The Visual Graph display shows the relations of a selected node, and you can expand the graph further. E.g. below are the connections of an offshore entity in Mauritius from the Linked Leaks dataset.
Here are the connections of Google from DBpedia. It shows locations, subsidiaries, products and industries; different classes use different colors:
The GraphDB Development Hub shows examples of visualizing GraphDB data with Ogma JS. Ogma is a JavaScript library developed by Linkurious. It allows you to develop a variety of visualizations with very little effort, as shown below:
Relations of Google to people and companies, using FactForge data.
Relations between companies, including offshore entities.
There are numerous powerful and popular visualization tools, creating an amazing variety of graphs and charts, such as:
Specialized tools also exist that can display charts, graphs, flows and many other visualizations, e.g.
Some examples from recent datathons and hackathons
We have recently submitted a Horizon 2020 BigData Research proposal, a significant part of which is research and design of powerful Declarative Visualization approaches. Our goal is to make a breakthrough in RDF visualization by making it much easier to generate visualizations than is currently possible. We will adopt and extend existing declarative visualization ontologies to select data (leveraging shapes) and specify visualization parameters. Some relevant previous efforts:
Ontotext's RDF by Example [17] semantic modeling tool generates diagrams completely declaratively from RDF models (instance diagrams). It uses PlantUML and Graphviz. It allows some "diagram tweaking" with extra RDF triples in the puml: namespace (e.g. puml:arrow puml:up means to render a particular edge upwards).
E.g. below is a semantic model of mapping Dun & Bradstreet company data to the Financial Industry Business Ontology (FIBO).
It includes 152 fields grouped in 32 nodes. It is generated automatically, with just a few extra triples specifying that some arrows should go up/left/right instead of down. We could simplify it by splitting it into several diagrams, but there is a certain symmetry and repeatability that aids understanding. For example, the left and right "wings" are addresses (physical and mailing) that have the same structure; at top right are three "measures" (NetWorth, AnnualSales, ProfitLoss) having essentially the same structure, as shown below (note: DnB currency code 20 means USD):
Once you include source info in your model (relational table/join for each node, and field name for each literal/URL), our tool can generate R2RML [18], the W3C standard for RDBMS-to-RDF transformation. E.g. consider the following model of Exhibitions for the J. Paul Getty Museum:
The node circled in red (representing an Exhibition at a Venue) is expanded to 15 nodes in the generated R2RML transformation, which means huge savings in complexity and maintainability:
Zooming on the left, you can see the R2RML details (rr:predicateObjectMap, rr:template, rr:termType, etc):
R2RML is verbose: on average it requires 3 nodes and 15 statements for every model statement. Writing R2RML by hand creates many opportunities for mistakes and results in a maintenance nightmare.
Furthermore, R2RML requires semantic experts to develop and is hard to understand by subject-matter experts (museum curators, commodity trade analysts, etc). This creates a knowledge gap between semantic and domain experts: Semantic Modeling tools like RDF by Example help to bridge that gap.
A lot of the popular visualization tools (e.g. Pentaho, Centrifuge, QlikView, Tableau) are primarily geared towards tabular data and have ODBC/JDBC interfaces.
To save the effort of constructing query URLs (as described in Invoking SPARQL Queries) and of saving query results, we can provide a JDBC API to GraphDB. The user feeds SPARQL queries (not SQL queries) through JDBC, and SPARQL tabular results are then returned to the tool. We will reuse one of these open source libraries:
If the visualization tool supports ODBC but not JDBC, we can use the JDBC-ODBC bridge (sun.jdbc.odbc.JdbcOdbcDriver); note that the bridge was removed in Java 8, so this requires Java 7 or earlier. See an example of connecting from Java to Excel using ODBC and the JDBC-ODBC bridge.
[1] S. Ferré. SQUALL: A Controlled Natural Language as Expressive as SPARQL 1.1. Applications of Natural Language to Information Systems (NLDB), 2013, LNCS 7934, p. 114-125. Springer
[2] S. Ferré. SQUALL: a Controlled Natural Language for Querying and Updating RDF Graphs. Controlled Natural Languages (CNL), 2012. LNCS 7427, p. 11-25, Springer.
[3] S. Ferré. SQUALL: a High-Level Language for Querying and Updating the Semantic Web. Research Report PI-1985, IRISA, 2011
[4] S. Ferré. Sparklis: An Expressive Query Builder for SPARQL Endpoints with Guidance in Natural Language, Semantic Web Journal, 2015
[5] Jens Lehmann, Lorenz Bühmann. AutoSPARQL: Let Users Query Your Knowledge Base, ESWC 2011
[6] Christina Unger, Lorenz Bühmann, Jens Lehmann, Axel-Cyrille Ngonga Ngomo, Daniel Gerber, Philipp Cimiano. Template-based question answering over RDF data, WWW 2012
[7] Generating a Large Dataset for Neural Question Answering over the DBpedia Knowledge Base by Ann-Kathrin Hartmann, Tommaso Soru, and Edgard Marx
[8] Neural Machine Translation for Query Construction and Composition by Tommaso Soru, Edgard Marx, André Valdestilhas, Diego Esteves, Diego Moussallem, and Gustavo Publio in ICML Workshop on Neural Abstract Machines & Program Induction (NAMPI v2)
[9] SPARQL as a Foreign Language by Tommaso Soru, Edgard Marx, Diego Moussallem, Gustavo Publio, André Valdestilhas, Diego Esteves, and Ciro Baron Neto in Proceedings of the 13th International Conference on Semantic Systems - SEMANTiCS2017 Posters and Demos
[10] Dana Dannells, Aarne Ranta, Ramona Enache, Mariana Damova, Maria Mateva, Multilingual Online Museum (WP8 Case study: Cultural Heritage). University of Gothenburg and Ontotext. MOLTO Final presentation, May 2013
[11] Dana Dannélls, Mariana Damova, Ramona Enache, Milen Chechev. Multilingual online generation from semantic web ontologies, WWW 2012
[12] Mateva M, Dannélls D, Ranta A, Enache R, Damova M. Multilingual grammar for museum object descriptions. MOLTO project deliverable 8.3, 2013
[13] Mateva M, Dannélls D, Damova M, Ranta A, Enache R. Multilingual access to cultural heritage content on the Semantic Web. ACL LATECH 2013
[14] Iva Delcheva, Yasen Kiprov, Nikolay Petrov, Victor Senderov. Linking Bulgarian Government Open Data with the Trade Register. Sofia Datathon, March 2017. Slideshare, Visualization
[15] Jan Polowinski. Towards RVL: a Declarative Language for Visualizing RDFS/OWL Data. In Proceedings of the 3rd International Conference on Web Intelligence, Mining and Semantics (WIMS '13), 38:1–38:11. New York, NY, USA: ACM, 2013. PDF, SLIDES, REPORT, doi:10.1145/2479787.2479825
[16] Jan Polowinski and Martin Voigt. VISO: A Shared, Formal Knowledge Base as a Foundation for Semi-automatic InfoVis Systems. In CHI '13 Extended Abstracts on Human Factors in Computing Systems (CHI WIP '13). Paris, France: ACM, 2013. PDF, POSTER, doi:10.1145/2468356.2468677
[17] RDF by Example: rdfpuml for True RDF Diagrams, rdf2rml for R2RML Generation. Alexiev, V. In Semantic Web in Libraries 2016 (SWIB 16), Bonn, Germany, November 2016. HTML, Video
[18] R2RML: RDB to RDF Mapping Language. Souripriya Das, Seema Sundara, Richard Cyganiak. W3C Recommendation. 27 Sep 2012