
1 of 14

Knowledge Architecture for Organisations

2 of 14

Knowledge Architecture for Organisations - Overview

Earlier: Fundamentals on ontologies, schemas, RDF, etc.
Now: How can we build an architecture for using these tools in real life?

We will explore this by looking at an Abstract Reference Architecture (ARA) for knowledge architecture

Why? It is impossible to propose a single architecture that fits all use cases of knowledge graphs

The focus here is on the architecture; details follow in later chapters

3 of 14

Architectural overview

The book identifies the architecture as having three main layers:

Knowledge Acquisition and Integration Layer

Creating the graph

Knowledge Storage Layer

Storing the graph

Knowledge Consumption Layer

Using the graph

4 of 14

Architectural overview

5 of 14

Acquisition and Integration Layer

Ontology Development

Choosing a vocabulary

Lightweight vs Heavyweight

Lightweight: Without formal definitions = Easy to build
Heavyweight: Includes formal definitions = Harder to build

If you choose a heavyweight vocabulary, the book suggests using a methodology

Defines the steps to carry out when developing the ontology
Suggests methods/tools to carry out these steps
Examples include METHONTOLOGY, DILIGENT, HCOME

6 of 14

Acquisition and Integration Layer

Text Integration

Text is usually integrated into the knowledge graph by two methods:

Named Entity Resolution

Involves extracting mentions of an entity in the knowledge graph from the text
Example: “Tim Cook took over as CEO of Apple”

Thematic Scope Resolution

Figure out what the text is actually talking about
Different from NER: extracting the mentions is not enough
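
To make the first method concrete, here is a minimal sketch of dictionary-based mention extraction on the example sentence. The entity IDs (`ex:TimCook`, `ex:Apple`), the label table and the function name `find_mentions` are assumptions made for illustration; the book does not prescribe this approach, and real systems typically also handle ambiguity and spelling variants.

```python
import re

# Hypothetical mini knowledge graph: entity ID -> known surface forms.
ENTITY_LABELS = {
    "ex:TimCook": ["Tim Cook"],
    "ex:Apple": ["Apple"],
}

def find_mentions(text, entity_labels=ENTITY_LABELS):
    """Return sorted (start offset, surface form, entity ID) for each label found."""
    mentions = []
    for entity_id, labels in entity_labels.items():
        for label in labels:
            # Lookarounds stop "Apple" from matching inside a longer word.
            pattern = r"(?<!\w)" + re.escape(label) + r"(?!\w)"
            for match in re.finditer(pattern, text):
                mentions.append((match.start(), label, entity_id))
    return sorted(mentions)

mentions = find_mentions("Tim Cook took over as CEO of Apple")
for start, surface, entity_id in mentions:
    print(start, surface, entity_id)
```

This only finds the mentions. Deciding what the text as a whole is about (thematic scope) would need a further step on top of them.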

7 of 14

Knowledge Storing and Accessing Layer

Overview

Choosing a format for storing your ontological data

Organizations often store their data in “silos”

I.e., many different systems that are not interconnected

Integrating an organization’s silos is known as the data integration problem
The book introduces three ways of storing ontological information:

Ontology-Based Data Access (OBDA)
RDF Stores
Property Graph-Based Stores

8 of 14

Knowledge Storing and Accessing Layer

Ontology-Based Data Access (OBDA)

In OBDA we store the data in their original databases
Implementation:

Separate data (ABox) and semantics (TBox)
TBox keeps track of the conceptual model
ABox extends the conceptual model into the data sources
ABox can be virtual or materialized

Virtual: retrieves instances directly from the data sources
Materialized: retrieves instances from a triple store that is updated with data from the data sources
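
The difference between a virtual and a materialized ABox can be sketched as follows. This is an illustration under stated assumptions, not an OBDA implementation: an in-memory SQLite table stands in for an existing source database, and the `employee` schema, the `ex:` names and the helper functions are all hypothetical.

```python
import sqlite3

# Stand-in for an existing source database (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, employer TEXT)")
db.execute("INSERT INTO employee VALUES (1, 'Tim Cook', 'Apple')")

# TBox: the conceptual model, kept apart from the data (not reasoned over here).
TBOX = {"classes": {"ex:Person"}, "properties": {"ex:name", "ex:worksFor"}}

def rows_to_triples(rows):
    """Mapping from source rows to ABox assertions using the TBox vocabulary."""
    for row_id, name, employer in rows:
        subject = f"ex:employee/{row_id}"
        yield (subject, "rdf:type", "ex:Person")
        yield (subject, "ex:name", name)
        yield (subject, "ex:worksFor", employer)

def read_source():
    return db.execute("SELECT id, name, employer FROM employee").fetchall()

def virtual_abox():
    """Virtual ABox: go to the source database every time instances are needed."""
    return set(rows_to_triples(read_source()))

def materialize_abox():
    """Materialized ABox: copy the instances into a separate store (here, a set)."""
    return set(rows_to_triples(read_source()))

triple_store = materialize_abox()
db.execute("INSERT INTO employee VALUES (2, 'Jane Doe', 'Example Corp')")

virtual_count = len(virtual_abox())   # sees the new row immediately
stale_count = len(triple_store)       # the copy has not been updated yet
triple_store = materialize_abox()     # refresh the copy from the source
fresh_count = len(triple_store)
print(virtual_count, stale_count, fresh_count)
```

The trade-off the sketch hints at: the virtual variant is always up to date but queries the sources on every request, while the materialized copy answers from its own store and has to be refreshed.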

9 of 14

Knowledge Storing and Accessing Layer

RDF Stores

RDF Stores are database systems specifically created to store triples (Subject, Predicate, Object)
Built specifically for “data volume, bulk loading speed and query answering efficiencies”
There are many database techniques that can be used to implement an RDF Store:

Relational databases
Native triple stores
Graph Databases
NoSQL
Etc.

What is the difference between RDF Stores and NoSQL databases?

NoSQL databases are popular for storing RDF data; however, they lack some benefits of RDF Stores
NoSQL databases are built for very lightweight schemas, whereas RDF Stores can handle both lightweight and heavier schemas
RDF Stores are based on W3C standards and built for using the web as a platform
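
As a rough illustration of the triple model, the sketch below keeps (subject, predicate, object) triples in a set and answers simple patterns in which `None` acts as a wildcard. The class `TripleStore` and the `ex:` identifiers are assumptions for illustration. Production RDF stores add indexing, bulk loading and a standard query language (SPARQL), none of which is modelled here.

```python
class TripleStore:
    """Toy in-memory store for (subject, predicate, object) triples."""

    def __init__(self):
        self._triples = set()

    def add(self, subject, predicate, obj):
        self._triples.add((subject, predicate, obj))

    def match(self, subject=None, predicate=None, obj=None):
        """Return sorted triples matching the pattern; None matches anything."""
        pattern = (subject, predicate, obj)
        return sorted(
            triple for triple in self._triples
            if all(p is None or p == value for p, value in zip(pattern, triple))
        )

store = TripleStore()
store.add("ex:TimCook", "ex:worksFor", "ex:Apple")
store.add("ex:TimCook", "ex:name", "Tim Cook")
store.add("ex:Apple", "ex:name", "Apple")

# Everything known about ex:TimCook, then every ex:name statement.
print(store.match(subject="ex:TimCook"))
print(store.match(predicate="ex:name"))
```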

10 of 14

Knowledge Storing and Accessing Layer

Property Graph-Based Stores

A property graph is simply a graph where nodes and edges can have multiple properties
Can be used to store RDF graphs
Similarly, RDF graphs can be made to store property graphs
Representing (schemaless) RDF graphs in property graphs:

Properties of edges must be labels
Properties of nodes expressed as triples with {subject, key, value}
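
The two rules above can be sketched as a small conversion, assuming a property graph whose edges carry only a label. The node and edge data and the function name `property_graph_to_triples` are hypothetical.

```python
# Hypothetical property graph: nodes with key/value properties, labelled edges.
nodes = {
    "tim_cook": {"name": "Tim Cook", "role": "CEO"},
    "apple": {"name": "Apple"},
}
edges = [("tim_cook", "worksFor", "apple")]

def property_graph_to_triples(nodes, edges):
    """Flatten a property graph into (subject, predicate, object) triples."""
    triples = []
    # Each node property becomes a triple (subject, key, value).
    for node_id, properties in nodes.items():
        for key, value in properties.items():
            triples.append((node_id, key, value))
    # Each edge becomes a triple whose predicate is the edge label.
    for source, label, target in edges:
        triples.append((source, label, target))
    return triples

triples = property_graph_to_triples(nodes, edges)
for triple in triples:
    print(triple)
```

Edges that carry further properties of their own do not fit this direct scheme, which is why the sketch restricts edges to a label.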

11 of 14

Knowledge Storing and Accessing Layer

Comparison

OBDA

Lightweight, doesn’t require creating new data structures
May not be suited for data intensive tasks

Relational Databases

Schema requirement means system always knows the structure of the graph
Hard to change or add to schema after it has been defined
Expensive joins when exploring large parts of graph

RDF Stores / Property Graph

Schemaless means easy to add new properties / change the schema
Cheap exploration of graph

12 of 14

Knowledge Consumption Layer

Semantic Search

Traditional search

Indexing documents
Query documents

Semantic search

Extends traditional search, but provides additional benefits
During indexing it can disambiguate entities
During search it can help disambiguate user queries

Did you mean “Apple (Company)” or “Apple (Fruit)”?
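
A minimal sketch of query-time disambiguation, assuming each entity in the graph has a label and a type. The entity table and the function name `disambiguation_prompt` are hypothetical, and matching on exact labels is a simplification.

```python
# Hypothetical entities: several can share the same label.
ENTITIES = {
    "ex:Apple_Company": {"label": "Apple", "type": "Company"},
    "ex:Apple_Fruit": {"label": "Apple", "type": "Fruit"},
    "ex:Tim_Cook": {"label": "Tim Cook", "type": "Person"},
}

def disambiguation_prompt(query):
    """Return a 'Did you mean' prompt if the query matches several entities."""
    wanted = query.strip().lower()
    candidates = sorted(
        f'{entity["label"]} ({entity["type"]})'
        for entity in ENTITIES.values()
        if entity["label"].lower() == wanted
    )
    if len(candidates) < 2:
        return None  # nothing to disambiguate
    return "Did you mean " + " or ".join(f'"{c}"' for c in candidates) + "?"

print(disambiguation_prompt("apple"))
print(disambiguation_prompt("Tim Cook"))
```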

13 of 14

Knowledge Consumption Layer

Summarization

Knowledge graphs can be summarized to give an overview of the information they contain
Entity summary

Summary of single entity in graph
Entity card

Graph summary

Summary of whole or part of graph

Goal-driven Graph Profiling

Custom summary of nodes relevant to some task

Graph Analytics

Finding interesting patterns in the graph

14 of 14

Knowledge Consumption Layer

Query Generation and Answering

Query generation

Users aren’t always familiar with the graph and/or tools they are using
Help users create the queries they want by helping the user understand the content/structure of the graph

Query answering

Traditional IR systems only return lists of documents as answers to a query
Using a knowledge graph we can more directly answer a user query
“How old is Tim Cook?”

Traditional IR: Here is a list of documents containing the text “How old is Tim Cook”
With knowledge graph: “Tim Cook is 42 years old”
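
The contrast can be sketched in a few lines. The example uses a different question ("Who is the CEO of Apple?") because it can be answered from the sentence used earlier in these slides. The triples, the documents, the single question template and both function names are assumptions for illustration; real systems parse questions far more flexibly.

```python
import re

# Hypothetical data: a tiny graph plus a tiny document collection.
TRIPLES = {("ex:Apple", "ex:ceo", "ex:TimCook")}
LABELS = {"ex:Apple": "Apple", "ex:TimCook": "Tim Cook"}
DOCUMENTS = ["Tim Cook took over as CEO of Apple", "Apple pie recipe"]

def keyword_search(query):
    """Traditional IR stand-in: documents sharing at least one word with the query."""
    terms = set(re.findall(r"\w+", query.lower()))
    return [doc for doc in DOCUMENTS if terms & set(re.findall(r"\w+", doc.lower()))]

def answer(question):
    """Answer one question template directly from the graph, else return None."""
    match = re.fullmatch(r"Who is the CEO of (.+)\?", question)
    if match is None:
        return None
    for subject, predicate, obj in TRIPLES:
        if predicate == "ex:ceo" and LABELS.get(subject) == match.group(1):
            return f"{LABELS[obj]} is the CEO of {LABELS[subject]}"
    return None

question = "Who is the CEO of Apple?"
print(keyword_search(question))  # a list of documents to read through
print(answer(question))          # a direct answer
```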