1 of 9

MetaGraph2Vec

Complex Semantic Path Augmented Heterogeneous Network Embedding

Code Review

BEST FOR You�ORGANICS COMPANY

2 of 9

Methodologies

MetaGraph2Vec is a heterogeneous network embedding method that learns more informative embeddings by capturing richer semantic relations between distant nodes.

  • Present a metagraph-guided random walk to generate heterogeneous neighborhoods in a heterogeneous information network (HIN).
  • Present the MetaGraph2Vec learning strategy to learn latent embeddings for multiple types of nodes.

3 of 9

Dataset

Some popular datasets for HINs:

  • ACM:
    • ~7M words.
    • Total embeddings: 11246
  • Yelp:
    • ~14M words.
    • Total embeddings: 37791

4 of 9

Algorithm

5 of 9

Generate heterogeneous neighborhood

Random walk

  • At each step, the walk first counts the edge types that satisfy the metagraph constraints and randomly selects one qualified edge type.
  • It then walks across one randomly chosen edge of the selected type to the next node.
  • If there is no qualified edge type, the random walk terminates.
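
The three steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the reference implementation: the function name `metagraph_walk`, the toy graph, and the representation of the metagraph as a list of type layers (with every type pair between adjacent layers allowed, and an edge type identified by the type of its target node) are all hypothetical.

```python
import random

def metagraph_walk(adj, node_type, layers, start, walk_length, rng=random):
    """Walk from `start`, following the type layers of a metagraph.

    adj: node -> list of neighbor nodes (assumed representation)
    node_type: node -> type label
    layers: list of sets of types; the last layer mirrors the first
            so that the pattern can repeat along the walk
    """
    walk = [start]
    pos = 0  # index of the metagraph layer the current node sits in
    while len(walk) < walk_length:
        if pos == len(layers) - 1:
            pos = 0  # reached the last layer: restart the pattern
        current = walk[-1]
        # Group neighbors by type, keeping only types allowed in the next layer.
        candidates = {}
        for nbr in adj.get(current, []):
            t = node_type[nbr]
            if t in layers[pos + 1]:
                candidates.setdefault(t, []).append(nbr)
        if not candidates:
            break  # no qualified edge type: the walk terminates
        edge_type = rng.choice(sorted(candidates))      # pick one qualified edge type
        walk.append(rng.choice(candidates[edge_type]))  # pick one edge of that type
        pos += 1
    return walk

# Toy bibliographic graph (hypothetical): authors (A), papers (P), venue (V).
adj = {
    "a1": ["p1"],
    "a2": ["p1", "p2"],
    "p1": ["a1", "a2", "v1"],
    "p2": ["a2", "v1"],
    "v1": ["p1", "p2"],
}
node_type = {"a1": "A", "a2": "A", "p1": "P", "p2": "P", "v1": "V"}
# Metagraph A -> P -> {A, V} -> P -> A, written as type layers.
layers = [{"A"}, {"P"}, {"A", "V"}, {"P"}, {"A"}]

print(metagraph_walk(adj, node_type, layers, "a1", 9, rng=random.Random(0)))
```

Choosing the edge type first and the edge second means a rare type (here, the single venue) is not drowned out by a type with many neighbors. The start node is assumed to have a type from the first layer.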

6 of 9

MetaGraph2Vec Embedding Learning

Skipgram

  • To learn node embeddings, the MetaGraph2Vec algorithm first generates a set of metagraph-guided random walks.
  • It then counts the occurrence frequency F(vi, vj) of each node-context pair (vi, vj) within a window of size w.
  • Finally, stochastic gradient descent is used to learn the embedding parameters.
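
The frequency-counting step can be illustrated with a short sketch. The function name `count_context_pairs` and the sample walk are hypothetical, and the sketch assumes a symmetric window, so that each node is paired with up to w nodes on either side of it in the walk.

```python
from collections import Counter

def count_context_pairs(walks, window):
    """Count F(vi, vj): how often vj appears within `window` steps of vi."""
    freq = Counter()
    for walk in walks:
        for i, center in enumerate(walk):
            lo = max(0, i - window)
            hi = min(len(walk), i + window + 1)
            for j in range(lo, hi):
                if j != i:  # a node is not its own context
                    freq[(center, walk[j])] += 1
    return freq

# One hypothetical walk over authors, papers, and a venue.
walks = [["a1", "p1", "v1", "p2", "a2"]]
pair_freq = count_context_pairs(walks, window=1)
print(pair_freq[("p1", "v1")])
```

The resulting counts are the training signal for the skip-gram step: pairs that co-occur often along the walks are pushed toward similar embeddings.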

7 of 9

Training

Training for each epoch

  • SGD optimization.
  • Training with sample batches.
  • Negative sampling.

8 of 9

Negative sampling

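To speed up training, negative sampling is used to approximate the objective function: instead of normalizing over every node in the network, each observed node-context pair is contrasted with a small number of randomly sampled negative nodes.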
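
A single SGD step with negative sampling might look like the sketch below. It follows the standard skip-gram negative-sampling loss and is only an illustration: the function name `sgns_update`, the two embedding tables `emb` and `ctx`, the learning rate, the embedding size, and the fixed list of negatives are all assumptions, and how negatives are drawn is left outside the sketch.

```python
import math
import random

def sigmoid(x):
    x = max(-30.0, min(30.0, x))  # clamp to keep exp() and log() well-behaved
    return 1.0 / (1.0 + math.exp(-x))

def sgns_update(emb, ctx, center, context, negatives, lr=0.025):
    """One SGD step on -[log s(u_c . v) + sum_k log s(-u_k . v)].

    emb / ctx: node -> list of floats (node and context embeddings).
    Returns the loss measured before the update.
    """
    v = emb[center]
    grad_v = [0.0] * len(v)
    loss = 0.0
    # The observed context node has label 1, sampled negatives have label 0.
    for node, label in [(context, 1.0)] + [(n, 0.0) for n in negatives]:
        u = ctx[node]
        score = sigmoid(sum(a * b for a, b in zip(u, v)))
        loss -= math.log(score if label == 1.0 else 1.0 - score)
        g = score - label  # gradient of the loss w.r.t. the dot product
        for d in range(len(v)):
            grad_v[d] += g * u[d]   # accumulate gradient for the center node
            u[d] -= lr * g * v[d]   # update the context embedding in place
    for d in range(len(v)):
        v[d] -= lr * grad_v[d]
    return loss

# Hypothetical setup: five nodes with 8-dimensional embeddings.
rng = random.Random(0)
nodes = ["a1", "a2", "p1", "p2", "v1"]
emb = {n: [rng.uniform(-0.5, 0.5) for _ in range(8)] for n in nodes}
ctx = {n: [rng.uniform(-0.5, 0.5) for _ in range(8)] for n in nodes}

first_loss = sgns_update(emb, ctx, "a1", "p1", ["v1", "a2"], lr=0.1)
for _ in range(100):
    last_loss = sgns_update(emb, ctx, "a1", "p1", ["v1", "a2"], lr=0.1)
print(first_loss, last_loss)
```

Each step touches only the center node, its context node, and the sampled negatives, so the cost per pair depends on the number of negatives rather than on the size of the network.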

9 of 9

Thank You
