1 of 9

Complex Embeddings for Simple Link Prediction

Théo Trouillon

Johannes Welbl

Sebastian Riedel

Éric Gaussier

Guillaume Bouchard

2 of 9

Introduction

How can we leverage the KB for link prediction?

+ KBs express data as a directed graph with labeled edges (relations) between nodes (entities).

+ The missing entries of a KB may be filled in by exploiting natural redundancies among the observed relations.

+ The goal of link prediction is the automatic discovery of relationships between entities in the graph.

3 of 9

The proposed method

4 of 9

Modelling Relations

+ A relation between two entities is represented as a binary value Y_so ∈ {−1, 1}, where s ∈ E is the subject of the relation and o ∈ E its object. Its probability is given by the logistic inverse link function:

P(Y_so = 1) = σ(X_so),

where X_so is the latent score of the pair (s, o).

+ Goal: find a generic structure for X that leads to a flexible approximation of common relations in real world KBs.

+ The notion of dot products is generalised to scoring functions, also known as composition functions, that combine embeddings in specific ways; the candidate considered here is the eigenvalue decomposition X = E W E⁻¹, with E the matrix of eigenvectors and W the diagonal matrix of eigenvalues.

+ Such a decomposition is not possible in the real space for the asymmetric matrices X that directed relations produce; it always exists in the complex space for normal matrices, X = E W Ē^T with E unitary, where embeddings are composed of a real vector component Re(x) and an imaginary vector component Im(x). Since X must stay real-valued, only the real part Re(E W Ē^T) is kept; see the sketch below.

+ For a given entity, its subject embedding vector is the complex conjugate of its object embedding vector.
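
Below is a minimal NumPy sketch of this model (an illustration with assumed toy sizes, not the authors' implementation): entities receive complex embeddings, the object side is conjugated, and the real part of the decomposition gives the score matrix X, which the logistic link turns into probabilities.

    import numpy as np

    rng = np.random.default_rng(0)
    n, K = 5, 3                                # 5 entities, rank-3 embeddings (toy sizes)

    # Complex entity embeddings and complex relation eigenvalues.
    E = rng.normal(size=(n, K)) + 1j * rng.normal(size=(n, K))
    w = rng.normal(size=K) + 1j * rng.normal(size=K)

    # X_so = Re(e_s^T diag(w) conj(e_o)): conjugating the object embedding
    # lets the real-valued score matrix X be asymmetric.
    X = np.real(E @ np.diag(w) @ E.conj().T)

    # Logistic inverse link: P(Y_so = 1) = sigma(X_so).
    P = 1.0 / (1.0 + np.exp(-X))

    print(np.allclose(X, X.T))                 # False in general: asymmetry is captured
    print(P[0, 1], P[1, 0])                    # P(Y_so = 1) need not equal P(Y_os = 1)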

5 of 9

Low-Rank Decomposition

+ For the model to be learnable, i.e. to generalize to unobserved links, some regularity assumptions are needed.

+ The sign-rank of a sign matrix Y is the smallest rank of a real matrix with the same sign pattern as Y; binary relations in real-world KBs typically have low sign-rank.

+ By imposing a low rank K, the score of a relation between entities s and o becomes:

X_so = Re(e_s^T W ē_o) = Re(Σ_k w_k e_sk ē_ok),

where e_s, e_o ∈ ℂ^K and W = diag(w).

+ Three points about this low-rank decomposition:

(1) The factorization encompasses all possible binary relations.

(2) By construction, it accurately describes both symmetric and antisymmetric relations (illustrated in the sketch after this list).

(3) Learnable relations can be efficiently approximated by a simple low-rank factorization, using complex numbers to represent the latent factors.
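
A short sketch (toy values, not from the paper's code) illustrating point (2): with X = Re(E W Ē^T), purely real eigenvalues w produce a symmetric relation, while purely imaginary ones produce an antisymmetric relation.

    import numpy as np

    rng = np.random.default_rng(1)
    E = rng.normal(size=(4, 2)) + 1j * rng.normal(size=(4, 2))  # toy complex embeddings

    def scores(w):
        # X = Re(E diag(w) conj(E)^T), the low-rank decomposition above.
        return np.real(E @ np.diag(w) @ E.conj().T)

    X_sym = scores(np.array([1.0, -2.0]))      # real eigenvalues
    X_anti = scores(np.array([1.0j, -2.0j]))   # purely imaginary eigenvalues

    print(np.allclose(X_sym, X_sym.T))         # True: symmetric relation
    print(np.allclose(X_anti, -X_anti.T))      # True: antisymmetric relation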

6 of 9

Application to Binary Multi-Relational Data

+ The scoring function φ(s, r, o; Θ) used to predict the entries of the tensor X is obtained by allotting one complex embedding w_r per relation:

φ(r, s, o; Θ) = Re(⟨w_r, e_s, ē_o⟩)  (10)

= ⟨Re(w_r), Re(e_s), Re(e_o)⟩ + ⟨Re(w_r), Im(e_s), Im(e_o)⟩ + ⟨Im(w_r), Re(e_s), Im(e_o)⟩ − ⟨Im(w_r), Im(e_s), Re(e_o)⟩  (11)

where ⟨a, b, c⟩ = Σ_k a_k b_k c_k denotes the trilinear product.

+ Equation (10) shows that, by using the complex conjugate of one of the embeddings, the model can handle asymmetric relations while the scores stay in the real domain.

+ Equation (11) involves only real vectors, corresponding to the real and imaginary parts of the embeddings and relations; the sketch below checks numerically that (10) and (11) agree.
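
A quick numerical check (with illustrative variable names) that the complex form (10) and its all-real expansion (11) compute the same score:

    import numpy as np

    rng = np.random.default_rng(2)
    K = 4
    e_s = rng.normal(size=K) + 1j * rng.normal(size=K)   # subject embedding
    e_o = rng.normal(size=K) + 1j * rng.normal(size=K)   # object embedding
    w_r = rng.normal(size=K) + 1j * rng.normal(size=K)   # relation embedding

    # Eq. (10): phi = Re(<w_r, e_s, conj(e_o)>).
    phi_complex = np.real(np.sum(w_r * e_s * np.conj(e_o)))

    # Eq. (11): the same score from real and imaginary parts only.
    phi_real = (np.sum(w_r.real * e_s.real * e_o.real)
                + np.sum(w_r.real * e_s.imag * e_o.imag)
                + np.sum(w_r.imag * e_s.real * e_o.imag)
                - np.sum(w_r.imag * e_s.imag * e_o.real))

    print(np.allclose(phi_complex, phi_real))            # True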

7 of 9

Code
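
The code shown on the original slide is not reproduced here; in its place is a small self-contained NumPy sketch of a ComplEx-style scorer with the logistic loss from the paper. Class and parameter names are illustrative assumptions, not the authors' API.

    import numpy as np

    class ComplExScorer:
        def __init__(self, n_entities, n_relations, rank, seed=0):
            rng = np.random.default_rng(seed)
            init = lambda *shape: (rng.normal(size=shape)
                                   + 1j * rng.normal(size=shape)) * 0.1
            self.E = init(n_entities, rank)    # complex entity embeddings
            self.W = init(n_relations, rank)   # complex relation embeddings

        def score(self, s, r, o):
            # phi(s, r, o) = Re(<w_r, e_s, conj(e_o)>), Eq. (10).
            return np.real(np.sum(self.W[r] * self.E[s] * np.conj(self.E[o])))

        def prob(self, s, r, o):
            # P(Y_sro = 1) = sigma(phi(s, r, o)).
            return 1.0 / (1.0 + np.exp(-self.score(s, r, o)))

        def logistic_loss(self, triples, labels):
            # Mean of softplus(-y * phi), the negative log-likelihood.
            phis = np.array([self.score(s, r, o) for s, r, o in triples])
            return np.mean(np.logaddexp(0.0, -np.asarray(labels) * phis))

    model = ComplExScorer(n_entities=10, n_relations=3, rank=4)
    print(model.prob(0, 1, 2))                               # one candidate triple
    print(model.logistic_loss([(0, 1, 2), (3, 0, 4)], [1, -1]))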

8 of 9

Conclusions

+ A simple approach to matrix and tensor factorization for link prediction that uses complex-valued vectors and retains the mathematical definition of the dot product.

+ This work has several potential directions for expansion:

    • One obvious direction is to combine the method with well-known tensor factorization extensions to further enhance prediction performance.
    • A smarter negative sampling procedure could be developed to produce more informative negatives with respect to the positive sample from which they are drawn.

9 of 9

Thank You
