Sequence-aware Reinforcement Learning over Knowledge Graphs
Ashish Gupta, Sr Statistical Analyst, Walmart Labs, Bangalore, India
Rishabh Mehrotra, Sr Research Scientist, Spotify Research, London, UK
Let’s first contextualize!
Last day
Last session
Last talk
Knowledge Graphs are useful
Knowledge represented as entities, edges and attributes
KGs are flexible
easy to integrate heterogeneous information
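A minimal sketch of the "entities, edges and attributes" view, assuming a small in-memory store. The class name KnowledgeGraph, its methods, and the example entities are all hypothetical and not taken from the talk; the point is only that a new information source becomes a new entity type plus a new relation.

```python
from collections import defaultdict


class KnowledgeGraph:
    """Toy KG: typed entities with attributes, plus labelled directed edges."""

    def __init__(self):
        self.entities = {}              # entity id -> {"type": ..., **attributes}
        self.edges = defaultdict(list)  # head id -> [(relation, tail id), ...]

    def add_entity(self, entity_id, entity_type, **attributes):
        self.entities[entity_id] = {"type": entity_type, **attributes}

    def add_edge(self, head, relation, tail):
        # Both endpoints must exist so that every edge links known entities.
        if head not in self.entities or tail not in self.entities:
            raise KeyError("add both entities before linking them")
        self.edges[head].append((relation, tail))

    def neighbors(self, head):
        return list(self.edges.get(head, []))


kg = KnowledgeGraph()
kg.add_entity("user:1", "user")
kg.add_entity("item:tent", "item", price=129.0)
kg.add_entity("brand:acme", "brand")
kg.add_edge("user:1", "purchased", "item:tent")
kg.add_edge("item:tent", "produced_by", "brand:acme")

# Integrating a new, heterogeneous source only needs a new type and relation.
kg.add_entity("word:waterproof", "review_word")
kg.add_edge("item:tent", "described_by", "word:waterproof")
```

Nothing in the store is specific to one entity type, which is the flexibility the slide refers to.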
Large scale KGs in industry!
Recommendations over a KG
WWW, SIGIR tutorial: ExplainAble Recommendation and Search (EARS), https://sites.google.com/view/ears-tutorial/
SeqReLG: Sequence-aware RL over Graphs
Outline
Product Knowledge Graphs
Paths in a Knowledge Graph
Reasoning path = how the agent reached the item from the user
Xian, Yikun, et al. "Reinforcement Knowledge Graph Reasoning for Explainable Recommendation." SIGIR 2019
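To make "reasoning path" concrete, here is a small sketch that enumerates every path from a user to an item in a toy graph. It uses exhaustive search rather than a learned agent, and the function name find_reasoning_paths, the "item:" id prefix, and the toy graph are assumptions made for illustration.

```python
def find_reasoning_paths(graph, user, item_prefix="item:", max_hops=3):
    """Return every simple path of at most max_hops edges from user to an item."""
    paths = []

    def walk(node, path, visited):
        hops = (len(path) - 1) // 2  # path alternates entity, relation, entity, ...
        if hops > 0 and node.startswith(item_prefix):
            paths.append(tuple(path))
        if hops == max_hops:
            return
        for relation, nxt in graph.get(node, []):
            if nxt not in visited:  # never revisit an entity on the same path
                walk(nxt, path + [relation, nxt], visited | {nxt})

    walk(user, [user], {user})
    return paths


toy_graph = {
    "user:1": [("purchased", "item:tent")],
    "item:tent": [("produced_by", "brand:acme")],
    "brand:acme": [("produces", "item:stove")],
}
paths = find_reasoning_paths(toy_graph, "user:1")
for path in paths:
    print(" -> ".join(path))
```

Each returned tuple doubles as the explanation: it lists the relations and entities that connect the user to the item.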
Knowledge Graph Reasoning (PGPR)
Step 1: Build Knowledge Graph
Step 2: Learn embeddings
Step 3: Train RL model: policy/value network
Step 4: Serve recommendations using trained policy network
Xian, Yikun, et al. "Reinforcement Knowledge Graph Reasoning for Explainable Recommendation." SIGIR 2019
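For Step 2, one common way to learn KG embeddings is a translation-style objective, in which head + relation should land near the tail. The sketch below assumes that form; the slides do not state the exact scoring function or loss, and the vectors and function names are made up for illustration.

```python
def translation_score(head, relation, tail):
    """Higher is better: negative squared distance between head + relation and tail."""
    return -sum((h + r - t) ** 2 for h, r, t in zip(head, relation, tail))


def margin_loss(positive, negative, margin=1.0):
    """Hinge loss that wants the true triple to outscore a corrupted one by margin."""
    return max(0.0, margin - positive + negative)


user = [0.1, 0.4]
purchased = [0.5, -0.2]
tent = [0.6, 0.2]    # user + purchased lands (almost exactly) here
stove = [-0.3, 0.9]  # corrupted tail used as a negative sample

pos = translation_score(user, purchased, tent)
neg = translation_score(user, purchased, stove)
loss = margin_loss(pos, neg)
```

Training would adjust the vectors by gradient descent until observed triples outscore corrupted ones; only the scoring and loss are shown here.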
SeqReLG: Sequence-aware RL over Graphs
SeqReLG: Advancements
RQ1: Can we train better embeddings?
RQ2: How important is considering item sequences in policy/value network?
RQ3: Can we incorporate look-ahead items while evaluating actions?
SeqReLG: Advancements
RQ1: Train better embeddings
SeqReLG: Advancements
RQ2: Sequence-aware policy/value network
Reinforcement Learning setup:
SeqReLG: Advancements
Policy / value network
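The RL setup can be pictured as an agent standing on one entity and choosing which outgoing edge to follow. The sketch below is a stand-in for the policy network: it scores each valid action with a dot product and samples from a softmax, whereas a real policy would be a learned neural network with a value head alongside it. All names, vectors, and the toy graph are assumptions.

```python
import math
import random


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def valid_actions(graph, current, visited):
    """Actions are outgoing (relation, entity) edges that do not revisit an entity."""
    return [(rel, ent) for rel, ent in graph.get(current, []) if ent not in visited]


def policy_step(graph, embeddings, state_vec, current, visited, rng):
    """Sample one action with probability proportional to exp(state . entity)."""
    actions = valid_actions(graph, current, visited)
    if not actions:
        return None, []
    scores = [sum(s * e for s, e in zip(state_vec, embeddings[ent]))
              for _, ent in actions]
    probs = softmax(scores)
    choice = rng.choices(actions, weights=probs, k=1)[0]
    return choice, probs


graph = {"user:1": [("purchased", "item:tent"), ("mentioned", "word:light")]}
embeddings = {"item:tent": [1.0, 0.0], "word:light": [0.0, 1.0]}
action, probs = policy_step(graph, embeddings, [2.0, 0.0], "user:1",
                            {"user:1"}, random.Random(0))
```

Restricting the softmax to valid_actions plays the role of an action mask: edges that do not exist in the graph can never be chosen.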
SeqReLG: Advancements
Sequence of states
Sequence important for Multi-Objective modeling
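The slides do not specify how the sequence of states is encoded. As an illustration only, the sketch below folds the embeddings visited so far into one state vector with a simple recurrent update, a hypothetical stand-in for whatever sequence model the authors use. What it demonstrates is that the same entities visited in a different order give a different state.

```python
import math


def encode_sequence(vectors, decay=0.5):
    """Fold a path of embeddings into one state vector, left to right.

    h_t = tanh(decay * h_{t-1} + x_t), applied elementwise, so the most
    recent entities weigh more and the visiting order changes the result.
    """
    state = [0.0] * len(vectors[0])
    for vec in vectors:
        state = [math.tanh(decay * h + x) for h, x in zip(state, vec)]
    return state


tent, stove = [1.0, 0.0], [0.0, 1.0]
forward = encode_sequence([tent, stove])   # tent first, stove most recent
backward = encode_sequence([stove, tent])  # stove first, tent most recent
```

An order-insensitive summary, such as the mean of the two vectors, would return the same state for both paths.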
SeqReLG: Advancements
RQ3: Incorporating look-ahead items
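One way to picture look-ahead is to score a candidate action by what it leads to as well as by the entity it lands on. The sketch below assumes a one-hop look-ahead and a fixed blending weight. Both are illustrative choices, not details from the talk, and every name in it is hypothetical.

```python
def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def lookahead_score(graph, embeddings, user_vec, candidate,
                    weight=0.5, item_prefix="item:"):
    """Blend the candidate's own score with the best item one hop beyond it."""
    immediate = dot(user_vec, embeddings[candidate])
    ahead = [dot(user_vec, embeddings[ent])
             for _, ent in graph.get(candidate, [])
             if ent.startswith(item_prefix)]
    future = max(ahead) if ahead else 0.0  # no reachable item: no look-ahead bonus
    return (1 - weight) * immediate + weight * future


graph = {
    "brand:acme": [("produces", "item:stove")],
    "brand:zed": [("produces", "item:mug")],
}
embeddings = {
    "brand:acme": [0.5, 0.5], "brand:zed": [0.5, 0.5],
    "item:stove": [1.0, 0.0], "item:mug": [0.0, 1.0],
}
user_vec = [1.0, 0.0]
acme = lookahead_score(graph, embeddings, user_vec, "brand:acme")
zed = lookahead_score(graph, embeddings, user_vec, "brand:zed")
```

The two brands look identical on their own embeddings; only the items reachable behind them separate the scores.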
Results - I
No change in RL model → only the embedding training phase changed
~2% improvement across all metrics
Better trained embeddings are useful!
Results - II
Added sequence of items
~4% improvement in hit rate
Improvements across all metrics
Sequence information of items in the path is helpful!
Results - III
RL model with look-ahead
~9% improvement in precision
Improvements across all metrics
Knowing where the path is headed helps!
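For reference, hit rate and precision are usually computed per user over the top k recommendations. The sketch below uses the standard definitions; the cutoff k and the example data are assumptions, since the slides do not give the evaluation protocol.

```python
def precision_at_k(recommended, relevant, k):
    """Share of the top-k recommendations that the user actually interacted with."""
    hits = sum(1 for item in recommended[:k] if item in relevant)
    return hits / k


def hit_rate_at_k(users, k):
    """Fraction of users with at least one relevant item in their top k."""
    hit_users = sum(1 for recommended, relevant in users
                    if any(item in relevant for item in recommended[:k]))
    return hit_users / len(users)


# Each entry pairs a ranked recommendation list with that user's held-out items.
users = [
    (["tent", "stove", "mug"], {"stove"}),
    (["lamp", "rope", "boots"], {"kayak"}),
]
first_user_precision = precision_at_k(users[0][0], users[0][1], k=3)
overall_hit_rate = hit_rate_at_k(users, k=3)
```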
Ongoing Work
Stakeholders
Multi-objective RL:
Thank You!
Summary:
Ashish Gupta
Rishabh Mehrotra
Ongoing work: