1 of 16

Shallow Representation Learning

Node Embedding

2 of 16

Unit Objectives

  • Introducing the node embedding problem
  • Embedding in simple graphs
  • Embedding in complex graphs

3 of 16

Node Embedding

  • Node Embedding
    • Low-dimensional vector representation of a node
    • Preserves some properties of the node in the original graph
  • Node in two domains
    • Graph domain: nodes and their connections
    • Embedding domain: node as a continuous vector
  • Node embedding as mapping
    • Mapping a node from the graph domain to the embedding domain
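In code, the mapping from the graph domain to the embedding domain is often just a lookup table: one learnable vector per node. A minimal sketch (the node count, dimension, and `embed` helper are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setting: 5 nodes embedded in 3 dimensions.
num_nodes, dim = 5, 3

# The embedding domain is a lookup table: one d-dimensional row per
# node, initialised randomly and refined during training.
Z = rng.normal(size=(num_nodes, dim))

def embed(node):
    """Map a node from the graph domain to the embedding domain."""
    return Z[node]

print(embed(2).shape)  # (3,)
```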

4 of 16

Embedding Methods: Intuition

A good node representation should make it possible to reconstruct the information we want to preserve

The pipeline has three components:

  • Mapping: maps each node to its embedding vector
  • Extractor: extracts the information to be preserved from the graph domain
  • Reconstructor: recovers that information from the embedding domain

Objective: minimize the reconstruction loss between the extracted and the reconstructed information
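This framework can be sketched end to end in a few lines. Here the extractor simply returns the adjacency matrix, the reconstructor approximates it with inner products of the embeddings, and gradient descent minimizes the reconstruction loss (the toy graph, dimension, and learning rate are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy graph as an adjacency matrix; here the extractor
# simply returns A itself as the information to preserve.
A = np.array([[0, 1, 1, 0],
              [1, 0, 1, 0],
              [1, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)

n, d = A.shape[0], 2
Z = rng.normal(scale=0.1, size=(n, d))  # the mapping: one vector per node

def loss(Z):
    # Reconstructor: inner products Z @ Z.T approximate the adjacency.
    return np.sum((A - Z @ Z.T) ** 2)

# Plain gradient descent on the reconstruction loss.
lr = 0.01
before = loss(Z)
for _ in range(500):
    grad = -4 * (A - Z @ Z.T) @ Z  # gradient of the squared error
    Z -= lr * grad
after = loss(Z)
print(before, after)  # the loss shrinks as the embeddings improve
```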

5 of 16

Node Embedding: Visual Example

[Figure: Zachary's Karate Club graph (input) and its learned node embedding (output)]

6 of 16

Graph Embedding

Simple Graphs

Node Co-occurrence

Structural Role

Node Status

Community Structure

Complex Graphs

Heterogeneous Graphs

Bipartite Graphs

Multi-dimensional Graphs

Signed Graphs

Hypergraphs

Dynamic Graphs

7 of 16

Embedding Simple Graphs

Node Co-occurrence Methods

8 of 16

Notions of Co-occurrence

… if the degree distribution of a connected graph follows a power law, we observe that the frequency which vertices appear in the short random walks will also follow a power-law distribution …

Language

Co-occurrence is the basis of fundamental representations in NLP: vector representations of words and neural language models

9 of 16

Notion of Co-Occurrence

Source: School of Information, Pratt Institute

Co-Authorship Network

An author is connected to a limited set of co-authors.

This connectivity can serve as a notion of co-occurrence.

Can we learn node representations that preserve the co-occurrence relations in the network?
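As a sketch of the idea, we can count co-occurrences directly: generate short random walks over a toy co-authorship graph and count how often two authors appear near each other (the author names, walk length, and window size are all hypothetical):

```python
import random
from collections import Counter

rng = random.Random(42)

# Hypothetical co-authorship graph: each author is connected to a
# limited set of co-authors (names are made up for illustration).
graph = {
    "ana": ["bo", "cy"],
    "bo": ["ana", "cy"],
    "cy": ["ana", "bo", "dee"],
    "dee": ["cy"],
}

def random_walk(start, length):
    walk = [start]
    for _ in range(length - 1):
        walk.append(rng.choice(graph[walk[-1]]))
    return walk

def cooccurrences(walks, window):
    """Count unordered node pairs appearing within `window` steps."""
    counts = Counter()
    for walk in walks:
        for i, u in enumerate(walk):
            for v in walk[i + 1: i + 1 + window]:
                counts[tuple(sorted((u, v)))] += 1
    return counts

walks = [random_walk(author, 10) for author in graph for _ in range(20)]
counts = cooccurrences(walks, window=2)

# Directly connected authors co-occur far more often than distant ones.
print(counts[("ana", "bo")], counts[("ana", "dee")])
```

That skew, frequent co-occurrence for connected authors, rare for distant ones, is exactly the signal an embedding can be trained to preserve.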

10 of 16

Language Models

  • A language model estimates the probability of a sequence of words appearing in a corpus
  • Typically factorized as next-word prediction: estimate Pr(w_n | w_1, ..., w_{n-1}) from the preceding words

11 of 16

Word Frequency in Natural Language

Word frequency in natural language follows a power law

Slide from Bryan Perozzi et al.

The second most used word appears half as often as the most used word; the third most used word appears one-third as often, and so on. This rank-frequency relation is known as Zipf's law.
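The rank-frequency relation can be spelled out numerically (the top count of 12000 is an arbitrary illustrative number):

```python
# Zipf's law: the r-th most frequent word occurs roughly 1/r times as
# often as the most frequent word (12000 is an arbitrary top count).
top = 12000
expected = [top / rank for rank in range(1, 5)]
print(expected)  # [12000.0, 6000.0, 4000.0, 3000.0]
```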

12 of 16

Connection: Language and Graphs

Scale-free graph → random walk on the graph → short truncated random walks treated as sentences of an artificial language

Vertex frequency in short random walks on scale-free graphs also follows a power law.
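The connection can be demonstrated without a graph library: grow a rough preferential-attachment graph (a sketch in the spirit of the Barabási–Albert model; sizes and walk parameters are illustrative assumptions), treat short truncated random walks as sentences, and observe that vertex frequency is heavily skewed toward hubs, just as word frequency is in natural language:

```python
import random
from collections import Counter

rng = random.Random(0)

# Grow a rough scale-free graph by preferential attachment: each new
# node attaches to an existing node chosen proportionally to degree.
edges = [(0, 1)]
targets = [0, 1]  # node multiset weighted by degree
for new in range(2, 200):
    old = rng.choice(targets)
    edges.append((new, old))
    targets += [new, old]

adj = {}
for u, v in edges:
    adj.setdefault(u, []).append(v)
    adj.setdefault(v, []).append(u)

# Treat short truncated random walks as "sentences".
def walk(start, length):
    w = [start]
    for _ in range(length - 1):
        w.append(rng.choice(adj[w[-1]]))
    return w

corpus = [walk(node, 8) for node in adj for _ in range(5)]
freq = Counter(v for w in corpus for v in w)

# Vertex frequency is heavily skewed: hub vertices dominate the walks,
# just as the most common words dominate a natural-language corpus.
ranked = [count for _, count in freq.most_common()]
print(ranked[0], ranked[len(ranked) // 2])
```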

13 of 16

Connection: Language and Graphs

  • Learning Word Representation
    • Learn a dense representation of words so that the word co-occurrence information is preserved
  • Learning Node Representation
    • Learn a dense representation of the nodes in the graph such that the node neighborhood information (community structure) is preserved
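A minimal sketch of this analogy in the DeepWalk spirit: train skip-gram with negative sampling on random-walk "sentences" over a toy two-triangle graph. The graph, dimensions, and hyperparameters are all illustrative assumptions, not the original method's settings:

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical toy graph: triangles {0,1,2} and {3,4,5} joined by edge 2-3.
adj = {0: [1, 2], 1: [0, 2], 2: [0, 1, 3],
       3: [2, 4, 5], 4: [3, 5], 5: [3, 4]}
n, d = 6, 8

W = rng.normal(scale=0.1, size=(n, d))  # node ("word") vectors
C = rng.normal(scale=0.1, size=(n, d))  # context vectors

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def walk(start, length):
    w = [start]
    for _ in range(length - 1):
        w.append(rng.choice(adj[w[-1]]))
    return w

lr, window, neg = 0.05, 2, 3
for _ in range(200):  # epochs over walk "sentences"
    for start in adj:
        w = walk(start, 6)
        for i, u in enumerate(w):
            for v in w[max(0, i - window): i + window + 1]:
                if v == u:
                    continue
                # positive pair: pull u's vector toward v's context vector
                g = sigmoid(W[u] @ C[v]) - 1.0
                gu, gv = g * C[v], g * W[u]
                W[u] -= lr * gu
                C[v] -= lr * gv
                # negative sampling: push u away from random contexts
                for z in rng.integers(0, n, size=neg):
                    g = sigmoid(W[u] @ C[z])
                    gu, gz = g * C[z], g * W[u]
                    W[u] -= lr * gu
                    C[z] -= lr * gz

def cos(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# Same-triangle nodes co-occur in walks, so their vectors become similar.
print(cos(W[0], W[1]), cos(W[0], W[5]))
```

In practice, libraries such as gensim's Word2Vec are typically run on the walk corpus instead of hand-written training loops; the sketch above only illustrates the mechanism.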

14 of 16

Language Model in Concrete Terms

  • Given the first n − 1 words, maximize the probability of the next word over the corpus: Pr(w_n | w_1, ..., w_{n-1})
  • Instead of raw word counts, learn a representation Φ(w) for each word and estimate this probability from the representations

15 of 16

The Problems

  • As the walk grows longer, computing Pr(v_n | v_1, ..., v_{n-1}) over all previous vertices becomes infeasible
  • Relaxations (as in word2vec): predict the surrounding context from a vertex within a fixed window, and ignore the order within that window

16 of 16

The Hope

  • Solving the optimization problem builds representations that capture the shared similarities in local graph structure between vertices
  • Vertices which have similar neighborhoods will acquire similar representations (encoding co-citation similarity)