1 of 32

Graph Mining in Recommender Systems

  • A Tutorial at WISE 2021

  • Speakers:
  • Dr Hongxu Chen
  • Miss Yicong Li
  • Mr Haoran Yang

University of Technology Sydney (UTS)

2 of 32

Agenda

  • Graph Representation Learning and its applications in Recommendation Systems

by Hongxu Chen, 14:15 – 14:45 October 27, 2021 (Wednesday)

  • Graph-based Explainable Recommender Systems

by Yicong Li, 14:45 – 15:15 October 27, 2021 (Wednesday)

  • Graph Contrastive Learning for Recommender Systems

by Haoran Yang, 15:15 – 15:45 October 27, 2021 (Wednesday)

3 of 32

1. Graph Representation Learning and its applications in Recommendation Systems

4 of 32

Networks are ubiquitous

5 of 32

Representing networks by vectors

  • Graph Representation Learning.
    • Also known as Graph Embedding or Network Embedding.
    • Learns a low-dimensional vector for each vertex.
    • Effectively preserves the network structure.

    • Downstream data mining tasks on graphs:
      • Link prediction.
      • Node classification.
      • Recommendation.

6 of 32

Graph Embedding Methods

  • Early Methods (Matrix Factorization-based)
    • E.g., MDS, IsoMap, Spectral Clustering, Laplacian Eigenmap, etc.
  • Random Walk-based Embedding Methods
    • E.g., DeepWalk, Node2Vec, Metapath2Vec, etc.
  • Graph Neural Networks (a one-layer GCN sketch appears at the end of this slide)
    • E.g., GCNs, GraphSAGE, GAT, Graph Autoencoder, etc.

Handy Materials:

[1]. http://web.stanford.edu/class/cs224w/
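To make the GNN family concrete, here is a minimal one-layer GCN propagation step, H' = ReLU(D^{-1/2} (A + I) D^{-1/2} H W), in plain NumPy; the toy graph, features, and weights are illustrative only, not from any cited paper:

import numpy as np

def gcn_layer(A, H, W):
    """One GCN propagation step with symmetric normalization and self-loops."""
    A_hat = A + np.eye(A.shape[0])               # add self-loops
    d = A_hat.sum(axis=1)                        # node degrees of A_hat
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W, 0.0)  # ReLU

# Toy graph: 4 nodes, 3-d input features, 2-d output embeddings.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 1],
              [0, 1, 0, 1],
              [0, 1, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
H = rng.normal(size=(4, 3))
W = rng.normal(size=(3, 2))
print(gcn_layer(A, H, W))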

7 of 32

Recommendation Systems as Graphs

https://images.app.goo.gl/YjzjovQnvZgwwdx56

8 of 32

As Bipartite Graph Mining

  • Matrix Factorization (a minimal sketch follows this list)
  • Factorization Machines
  • Bayesian Analysis (PMF, BPR)
  • Deep Learning (NCF, DMF, DeepFM)
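As a concrete instance of the first family, below is a minimal sketch of matrix factorization trained by SGD on the observed edges of the user-item bipartite graph; the toy rating matrix and hyper-parameters are illustrative only:

import numpy as np

def factorize(R, k=8, lr=0.01, reg=0.05, epochs=200, seed=0):
    """Minimal matrix factorization via SGD on observed ratings.
    R: user-item rating matrix with 0 marking unobserved entries.
    Returns user factors P (m x k) and item factors Q (n x k)."""
    rng = np.random.default_rng(seed)
    m, n = R.shape
    P = rng.normal(scale=0.1, size=(m, k))
    Q = rng.normal(scale=0.1, size=(n, k))
    users, items = np.nonzero(R)              # observed edges of the bipartite graph
    for _ in range(epochs):
        for u, i in zip(users, items):
            pu = P[u].copy()
            err = R[u, i] - pu @ Q[i]         # prediction error on one edge
            P[u] += lr * (err * Q[i] - reg * pu)
            Q[i] += lr * (err * pu - reg * Q[i])
    return P, Q

# Toy 4-user x 5-item interaction matrix (0 = unobserved).
R = np.array([[5, 3, 0, 1, 0],
              [4, 0, 0, 1, 0],
              [1, 1, 0, 5, 4],
              [0, 1, 5, 4, 0]], dtype=float)
P, Q = factorize(R)
print(np.round(P @ Q.T, 1))                   # predicted ratings for all pairs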

9 of 32

Timeline of development of Bipartite Modelling Approaches

Deng, Y. (2021). Recommender systems based on graph embedding techniques: A comprehensive review. arXiv preprint arXiv:2109.09587.

10 of 32

As General Graph Modelling

  • Trans Family (e.g., TransE, TransR, TransH, TransG, etc.)
  • Random Walk and Meta-path
  • Deep Learning

11 of 32

Trans Family

  • Model triplets (h,r,t) as basic elements in a graph.

For example:

      • TransE models a relation r as a translation from head entity h to tail entity t.
      • TransH projects each pair (h, t) onto relation-specific hyper-planes, distinguishing the diverse relations between nodes.
      • TransR maps nodes and relations separately into a node space and multiple different relation spaces corresponding to the diverse relations between head-tail pairs, respectively.
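To make the translation idea concrete, here is a minimal sketch of TransE's scoring function and margin ranking loss; the random embeddings and dimensions are purely illustrative:

import numpy as np

def transe_score(h, r, t):
    """TransE plausibility of triplet (h, r, t) as ||h + r - t||.
    A lower distance means a more plausible triplet."""
    return np.linalg.norm(h + r - t)

def margin_loss(pos, neg, margin=1.0):
    """Margin ranking loss used to train TransE-style models:
    push positive triplets at least `margin` closer than negatives."""
    return max(0.0, margin + pos - neg)

rng = np.random.default_rng(0)
h, r, t = rng.normal(size=(3, 16))        # toy 16-d embeddings
t_neg = rng.normal(size=16)               # corrupted tail for a negative triplet
loss = margin_loss(transe_score(h, r, t), transe_score(h, r, t_neg))
print(loss)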

12 of 32

Random Walk and Meta-path

  • Meta-path
    • A meta-level description of a path between two objects (Sun et al.).
    • Different meta-paths convey different semantics.

13 of 32

Random Walk and Meta-path

  • Random walks as paths.
  • Encode the paths as sentences (inspired by advances in NLP).
  • Word2Vec: Skip-gram and Continuous Bag of Words (CBOW).
  • Negative sampling.

  • Graph embedding techniques recap (a DeepWalk-style sketch follows this list):
    • LINE
    • RandWalk
    • DeepWalk
    • Node2Vec
    • Metapath2Vec
    • BiNE (can also be classified as a Bipartite Network Embedding-based method)
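A minimal DeepWalk-style sketch, combining uniform random walks with gensim's Skip-gram (sg=1) and negative sampling; the karate-club graph and hyper-parameters are illustrative only:

import random
import networkx as nx
from gensim.models import Word2Vec

def random_walks(G, walk_len=10, walks_per_node=20, seed=0):
    """Generate uniform random walks and encode them as 'sentences' of node ids."""
    random.seed(seed)
    walks = []
    for _ in range(walks_per_node):
        for start in G.nodes():
            walk = [start]
            while len(walk) < walk_len:
                nbrs = list(G.neighbors(walk[-1]))
                if not nbrs:
                    break
                walk.append(random.choice(nbrs))
            walks.append([str(n) for n in walk])
    return walks

G = nx.karate_club_graph()
walks = random_walks(G)
# Skip-gram (sg=1) with negative sampling, exactly as in the Word2Vec analogy.
model = Word2Vec(walks, vector_size=64, window=5, sg=1, negative=5, min_count=0)
print(model.wv[str(0)][:5])   # first 5 dimensions of node 0's embedding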

14 of 32

Deep Learning based approaches

  • Autoencoders
  • SDAE
  • Transformer
  • GCNs
  • GraphSAGE
  • GAT

15 of 32

Timeline of General Graph Embedding based Approach

Deng, Y. (2021). Recommender systems based on graph embedding techniques: A comprehensive review. arXiv preprint arXiv:2109.09587.

16 of 32

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

  • Latent Space Models
    • Most are built in Euclidean space.
      • Suffer from high distortion.
      • Need to increase the latent size in order to reduce the distortion.
      • The resources needed to train and store the model also increase.

  • Hyperbolic Space:
    • A promising new latent space for resolving the above issues.
      • Expands faster than Euclidean space.
      • Preserves the hierarchies of the data.

As part of the Australian ARC-funded Discovery Project - Dynamics and Control of Complex Social Networks (DynaCo)

17 of 32

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

  • Issues in the existing literature on hyperbolic space:
    • Practitioners indiscriminately transfer variants of existing algorithms into hyperbolic geometry, regardless of whether there is a strong motivation for doing so.
    • Analysis of, and guidance on, when to use hyperbolic space is missing.

  • Our Contributions:
    • We provide theoretical analysis and empirical results to validate our three hypotheses about the effectiveness of hyperbolic space.
    • We address the drawbacks of hyperbolic space, and give comments on when and where to use hyperbolic space.
    • We propose a metric-learning-based social recommendation method, SCML, and its hyperbolic version, HSCML, and validate our hypotheses on them. We also show that hyperbolic space achieves state-of-the-art performance by comparing HSCML with other baselines on two benchmark datasets.

As part of the Australian ARC-funded Discovery Project - Dynamics and Control of Complex Social Networks (DynaCo)

18 of 32

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

  • Hypotheses
    • Distance models are better suited to learning hyperbolic embeddings than projection models.
      • The distribution of embeddings after convergence differs between the two.
      • Embeddings in projection models gather around the origin, making little use of the whole space.
      • Distance models can learn hierarchical information more easily.

    • Hyperbolic space is more powerful than Euclidean space when the density of the dataset is small.
      • Hyperbolic space can boost learning using the extracted hierarchical information.

    • Hyperbolic space performs better than Euclidean space when the latent size is small, but as the latent size increases, the performance of Euclidean space becomes comparable with hyperbolic space. (A sketch of the hyperbolic distance follows this list.)
      • The latent size needed to embed a certain amount of information is smaller for hyperbolic space.
      • With a large enough latent size, Euclidean space will also have very small distortion.
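For concreteness, the geodesic distance in the Poincare ball model (one common realisation of hyperbolic space) can be computed as below; this is an illustrative sketch, not the exact model used in SCML/HSCML:

import numpy as np

def poincare_distance(u, v, eps=1e-9):
    """Geodesic distance in the Poincare ball model:
    d(u, v) = arcosh(1 + 2 ||u - v||^2 / ((1 - ||u||^2)(1 - ||v||^2)))."""
    sq = np.sum((u - v) ** 2)
    denom = (1 - np.sum(u ** 2)) * (1 - np.sum(v ** 2))
    return np.arccosh(1 + 2 * sq / (denom + eps))

origin = np.zeros(2)
near_boundary = np.array([0.95, 0.0])
# Distances blow up near the boundary: the space "expands" faster than
# Euclidean space, leaving room for hierarchies even at low dimensions.
print(poincare_distance(origin, near_boundary))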

As part of the Australian ARC-funded Discovery Project - Dynamics and Control of Complex Social Networks (DynaCo)

19 of 32

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

  • Experiments (General Item Recommendation)
    • High Density Datasets vs. Low Density Datasets

As part of the Australian ARC-funded Discovery Project - Dynamics and Control of Complex Social Networks (DynaCo)

20 of 32

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

  • Experiments (General Item Recommendation)
    • Influence of the Latent Size

As part of the Australian ARC-funded Discovery Project - Dynamics and Control of Complex Social Networks (DynaCo)

21 of 32

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

  • Experiments (General Item Recommendation)
    • Distance Model vs. Projection Model

As part of the Australian ARC-funded Discovery Project - Dynamics and Control of Complex Social Networks (DynaCo)

23 of 32

Where are we in embedding spaces? A Comprehensive Analysis on Network Embedding Approaches for Recommender Systems

  • Drawbacks of Hyperbolic Space
    • High computational complexity.
    • Invalid values caused by numerical instability.

  • Comments on Using Hyperbolic Space
    • Distance models are more suited for learning hyperbolic embeddings than projection models.
    • If the density of the dataset is large, Euclidean space is the better choice: its performance is comparable with hyperbolic space and its computational complexity is low. When the density is low, hyperbolic space is preferable, because it performs much better than Euclidean space.
    • Choose an appropriate latent size. In most cases, hyperbolic embeddings only need a relatively small latent size to achieve good performance, which can help to save resources.

As part of the Australian ARC-funded Discovery Project - Dynamics and Control of Complex Social Networks (DynaCo)

24 of 32

Others and Challenges

Three grand challenges:

  • Heterogeneity
  • Scalability
  • Dynamics

25 of 32

Timeline of key developments regarding the challenges

Deng, Y. (2021). Recommender systems based on graph embedding techniques: A comprehensive review. arXiv preprint arXiv:2109.09587.

26 of 32

Heterogeneity

The Real World: Heterogeneous Networks

    • Multiple object types.
    • Multiple link types.

E.g., vertex u1 is close to both b2 and u3, but these relationships have different semantics: b2 is a business visited by user u1, while u3 is a friend of u1.

Homogeneous Networks

    • Single-Typed Nodes.
    • Single-Typed Links.

27 of 32

Projected Metric Embedding (KDD18)

      • Dot product is used to compute the proximity between different types of nodes.
        • It is not a metric-based distance.
        • It violates the crucial triangle inequality.

Therefore, existing HIN embedding methods (e.g., Metapath2vec and EOE):

        • Can only capture local structures: Node A is close to Node C, and Node B is close to Node C.
        • But fail to capture the second-order proximity: Node A should also be close to Node B (a toy demonstration follows).
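A toy NumPy demonstration of this failure mode, with hypothetical 2-d embeddings for the three nodes:

import numpy as np

# A and B are both "close" to C under dot-product similarity, yet A and B
# score zero with each other -- the dot product gives no triangle-inequality
# guarantee that ties A and B together.
A = np.array([4.0, 0.0])
B = np.array([0.0, 4.0])
C = np.array([1.0, 1.0])

print(A @ C, B @ C)   # 4.0 4.0  -> both pairs look similar
print(A @ B)          # 0.0      -> the second-order proximity is lost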

  • Previous attempts to model semantic-specific relationships:

      • Metapath2vec (Dong et al., KDD 2017)
      • EOE (Xu et al., WSDM 2017)

28 of 32

Projected Metric Embedding (KDD18)

Our proposed method

    • Projected Metric Embedding for HIN
      • Simultaneously preserves:
        • The first-order proximities between nodes
        • And the second-order proximities between nodes

Such that, in the latent space, directly linked nodes (first-order) and nodes sharing common neighbours (second-order) are both embedded close together.

29 of 32

Projected Metric Embedding (KDD18)

  • However, directly applying the Euclidean distance as a metric is problematic!
    • Mathematically, it is geometrically restrictive and yields an ill-posed algebraic system.
    • Moreover, one object may have multiple aspects.

  • To address these issues:
    • PME introduces relation-specific projection embedding matrices.
    • Model objects and relations in distinct spaces.
      • One shared object space.
      • Multiple relation spaces (i.e., relation-specific object spaces).

Hence, it is possible that some objects are far away from each other in the object space, but close to each other in the corresponding relation spaces (a sketch follows).
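A small illustrative sketch of this idea (not the exact PME formulation): shared object embeddings are projected by a per-relation matrix before measuring distance, so the same pair can be far apart in one relation space and close in another. All names, matrices, and dimensions below are hypothetical:

import numpy as np

rng = np.random.default_rng(0)
d = 16                                     # shared object-space dimension
z = {name: rng.normal(size=d) for name in ("u1", "u3", "b2")}
# One hypothetical projection matrix per relation (friendship, visiting).
M = {rel: rng.normal(scale=0.3, size=(d, d)) for rel in ("friend", "visit")}

def relation_distance(rel, a, b):
    """Distance after projecting shared embeddings into a relation space."""
    return np.linalg.norm(M[rel] @ z[a] - M[rel] @ z[b])

# The same pair can be far in one relation space and near in another.
print(relation_distance("friend", "u1", "u3"))
print(relation_distance("visit", "u1", "b2"))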

30 of 32

Projected Metric Embedding (KDD18)

  • Decompose the HIN into link-specific sub-networks.
  • Perform metric learning on each link-specific sub-network.

  • Learn shared node embeddings and relation-specific spaces.

31 of 32

Projected Metric Embedding (KDD18)

Model optimization:

    • Bi-directional Negative Sampling Strategy

    • Directly optimizing the full objective is expensive!

Inspired by negative sampling techniques:

    • First fix vertex v_i and edge type r, then generate K negative vertices v_k.
    • Then fix the right-hand side of edge e_ij^r and sample K negative vertices from the left-hand side.

(A sketch of this bi-directional strategy follows.)
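A minimal sketch of the bi-directional strategy under assumed data structures (node lists keyed by type, triplet-style edges); the type names and helper function are hypothetical:

import random

def bidirectional_negatives(edge, nodes_by_type, head_type, tail_type, K=5, seed=0):
    """For one positive edge (v_i, v_j) of relation r: first fix v_i and draw K
    corrupted tails, then fix v_j and draw K corrupted heads."""
    rng = random.Random(seed)
    v_i, v_j, r = edge
    tails = [n for n in nodes_by_type[tail_type] if n != v_j]
    heads = [n for n in nodes_by_type[head_type] if n != v_i]
    neg_tails = [(v_i, t, r) for t in rng.sample(tails, K)]   # corrupt the right side
    neg_heads = [(h, v_j, r) for h in rng.sample(heads, K)]   # corrupt the left side
    return neg_tails + neg_heads

nodes_by_type = {"user": [f"u{i}" for i in range(10)],
                 "business": [f"b{i}" for i in range(10)]}
print(bidirectional_negatives(("u1", "b2", "visit"), nodes_by_type,
                              "user", "business", K=3))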

32 of 32

Projected Metric Embedding (KDD18)

Model optimization:

    • Loss-aware Adaptive Positive Sampling Strategy

      • Maintain a sequence of the losses for each sub-network.
      • Calculate the sum of the losses.
      • Draw a random value within the range [0, 1].
      • See which interval of the cumulative losses the random value falls into, so that sub-networks with larger losses are sampled more often (a sketch follows).
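A minimal sketch of this roulette-wheel-style selection, with hypothetical sub-network names and loss values:

import random

def sample_subnetwork(losses, seed=None):
    """Pick a link-specific sub-network with probability proportional to its
    current loss: sum the losses, draw u in [0, 1], and find the interval of
    the cumulative distribution that u falls into."""
    rng = random.Random(seed)
    total = sum(losses.values())
    u = rng.random()
    cumulative = 0.0
    for name, loss in losses.items():
        cumulative += loss / total
        if u <= cumulative:
            return name
    return name                       # guard against floating-point round-off

losses = {"user-user": 0.9, "user-business": 0.4, "business-category": 0.2}
print(sample_subnetwork(losses, seed=42))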