Exploring Re-Ranking Strategies for E-commerce Search

Coen Baars, Arian Stolwijk

26 November 2024

Coen Baars

CTO & Co-founder at Giftomatic

Tech-enthusiast with 20 years of experience in Search Relevance, Web Development, and UX

Rotterdam, the Netherlands

Arian Stolwijk

Head of Engineering at Giftomatic

Amersfoort, the Netherlands

About Giftomatic

  • Startup founded in 2019

  • Currently developing solutions for gift card providers

  • Our main product is an e-commerce search engine for optimised product search that matches specific gift cards
  • Our goal is to enhance gift card holders’ search experience while increasing gift card providers’ margins

  • Active in 20+ countries including The Netherlands, Germany, US, UK, Canada, Australia

Today’s Agenda

A “simple” search example

How to improve search results using rerankers

Conclusion

Alice

Alice is a clever and trendy 16-year-old who loves experimenting with makeup and perfecting her style. She has a passion for beauty and fashion.

Amsterdam, the Netherlands

Alice's Christmas Wish List

  • Beautyblender
  • Red backpack for school
  • Apple watch
  • Frozen Pyjama
  • Book about horses


The Elves Quest

  • The Challenge: Find the perfect presents for Alice.

  • Keys to success:
    • Understand User Context: Ensure results align with Alice’s personal preferences and needs.
    • Capture Intent: Look beyond the literal query to uncover what Alice truly wants (e.g., distinguishing between "Apple Watch" the brand and a themed design).
    • Provide a Balanced and Diverse Result Set: Present a mix of highly relevant options and diverse choices to keep the selection engaging and comprehensive.

Time for Santa’s little helpers to start their search for Alice!

Part 1

  • Retrieval (BM25/Semantic)
  1. Input: User submits a query (e.g., "Apple Watch").
  2. Retriever: Search engine finds items with exact or partial matches to the query.
  3. Result: Results are ranked based on basic relevance, often ignoring context or intent.
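
The lexical scoring in step 2 can be sketched with a toy BM25 implementation (a minimal sketch; the corpus, whitespace tokenizer, and parameter values k1/b are our own illustrative choices):

```python
import math
from collections import Counter

def bm25_scores(query, corpus, k1=1.2, b=0.75):
    """Score each document in `corpus` against `query` with BM25."""
    docs = [doc.lower().split() for doc in corpus]
    avgdl = sum(len(d) for d in docs) / len(docs)
    n = len(docs)
    # document frequency: in how many docs does each term occur?
    df = Counter(term for d in docs for term in set(d))
    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query.lower().split():
            if term not in tf:
                continue
            idf = math.log((n - df[term] + 0.5) / (df[term] + 0.5) + 1)
            # term-frequency saturation, normalized by document length
            score += idf * tf[term] * (k1 + 1) / (
                tf[term] + k1 * (1 - b + b * len(d) / avgdl))
        scores.append(score)
    return scores

corpus = [
    "red backpack for school",
    "red lipstick luxury beauty",
    "blue school backpack",
]
scores = bm25_scores("red backpack for school", corpus)
```

The document matching all query terms scores highest; note that the "red lipstick" document still gets a non-zero score from the single overlapping term, which is exactly the context-blindness discussed above.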


Lexical Search (BM25)

Query: “Red backpack for School”

User context: “teenager, girl/woman, likes: beauty, fashion, luxury”

Criteria: Create a diverse result set matching the query and user context.

Discussion:

Is this what we expected, or do we need to improve?


Semantic Search

Query Vector

[0.12, 5.04, 0.02, 0.93, …, 2.34]

Document Vectors

[0.11, 4.02, 0.00, 1.10, …, 2.54]

[0.0, 99.04, 0.01, 4.93, …, 1.30]

[0.12, 3.52, 0.65, 0.64, …, 9.23]

https://www.sbert.net/examples/applications/semantic-search/README.html

similarity(query_vector, doc_vector)
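
The similarity call above is typically cosine similarity. A minimal sketch using the toy vectors from this slide (truncated here to five dimensions):

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: dot(a, b) / (|a| * |b|)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

query_vector = [0.12, 5.04, 0.02, 0.93, 2.34]
doc_vectors = [
    [0.11, 4.02, 0.00, 1.10, 2.54],
    [0.0, 99.04, 0.01, 4.93, 1.30],
    [0.12, 3.52, 0.65, 0.64, 9.23],
]
# rank document indices by similarity to the query, best first
ranked = sorted(range(len(doc_vectors)),
                key=lambda i: cosine_similarity(query_vector, doc_vectors[i]),
                reverse=True)
```

Because cosine similarity normalizes by vector length, the second document's large magnitude does not automatically make it the best match.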


Semantic Search

Query: “Red backpack for School”

User context: “teenager, girl/woman, likes: beauty, fashion, luxury”

Criteria: Create a diverse result set matching the query and user context.

Discussion:

Is this what we expected, or do we need to improve?


Hybrid search scores
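
Hybrid search combines the lexical and semantic scores into one ranking. A common recipe is a weighted sum of min-max-normalized scores (a sketch; the weight alpha, the normalization choice, and the toy score values are our own, not necessarily what this slide shows):

```python
def min_max(scores):
    """Rescale scores to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def hybrid_scores(bm25, semantic, alpha=0.5):
    """Convex combination of normalized BM25 and semantic scores."""
    b, s = min_max(bm25), min_max(semantic)
    return [alpha * x + (1 - alpha) * y for x, y in zip(b, s)]

# toy scores for three documents from the two retrievers
h = hybrid_scores([2.3, 0.4, 1.0], [0.99, 0.91, 0.72], alpha=0.5)
```

Normalizing first matters: raw BM25 scores and cosine similarities live on very different scales, so adding them directly would let one retriever dominate.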

Hybrid Search

Query: “Red backpack for School”

User context: “teenager, girl/woman, likes: beauty, fashion, luxury”

Criteria: Create a diverse result set matching the query and user context.

Discussion:

Is this what we expected, or do we need to improve?



How to improve search results using re-rankers

Overview of various reranking techniques

Part 2

  • Reranking

  1. Input: User submits a query (e.g., "Apple Watch").
  2. Retriever: Search engine finds items with exact or partial matches to the query.
  3. Result: Results are ranked based on basic relevance, often ignoring context or intent.
  4. Reranker: Reorders the retrieved results using richer signals such as context and intent.
  5. Enhanced results: The final ranking better matches what the user actually wants.

Re-ranker Strategies

  • Reciprocal Rank Fusion (RRF)
  • Maximal Marginal Relevance (MMR)
  • Learning To Rank (LTR)
  • Cross Encoders
  • External Rerank API (Cohere/JinaAI API)
  • Large Language Model



Reciprocal Rank Fusion

  • Combines the ranked lists of two retrievers
  • Uses only the rank position of each product in each retriever’s list

Doc | Retriever A | Retriever B | Score A | Score B | Total
----|-------------|-------------|---------|---------|------
 A  |      1      |      5      |   1/1   |   1/5   | 1.20
 B  |      2      |      4      |   1/2   |   1/4   | 0.75
 C  |      3      |      3      |   1/3   |   1/3   | 0.67
 D  |      4      |      1      |   1/4   |   1/1   | 1.25
 E  |      —      |      2      |    0    |   1/2   | 0.50

(E is returned only by Retriever B, so it contributes 0 from Retriever A.)
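
The table sums each document's reciprocal ranks across the two retrievers. A minimal sketch (the table uses 1/rank; the common formulation adds a constant, 1/(k + rank) with k = 60, which corresponds to a larger `k` below):

```python
def rrf(*rankings, k=0):
    """Reciprocal Rank Fusion over any number of ranked lists.

    Each ranking maps doc id -> 1-based rank. Documents missing
    from a list contribute 0 from that list.
    """
    docs = set().union(*rankings)
    return {doc: sum(1 / (k + r[doc]) for r in rankings if doc in r)
            for doc in docs}

retriever_a = {"A": 1, "B": 2, "C": 3, "D": 4}
retriever_b = {"D": 1, "E": 2, "C": 3, "B": 4, "A": 5}
scores = rrf(retriever_a, retriever_b)
# D: rank 4 in A, rank 1 in B -> 1/4 + 1/1 = 1.25, the winner
```

Because only rank positions are used, RRF needs no score normalization, which is why it is a popular default for fusing lexical and semantic result lists.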

Maximal Marginal Relevance (MMR)

  • Diversity Reranker
  • Iteratively add documents to the selected set
  • Penalize each candidate document by its maximum similarity to the already selected documents
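
The iteration above can be sketched as follows (a minimal sketch; the similarity values, the toy product names, and the lambda trade-off are our own illustrative choices):

```python
def mmr(query_sim, doc_sim, doc_ids, top_k, lam=0.5):
    """Maximal Marginal Relevance.

    query_sim: dict doc -> similarity to the query.
    doc_sim: dict (doc, doc) -> similarity between two documents.
    lam trades off relevance (1.0) against diversity (0.0).
    """
    selected = []
    candidates = list(doc_ids)
    while candidates and len(selected) < top_k:
        def score(d):
            # penalty: max similarity to anything already selected
            penalty = max((doc_sim[(d, s)] for s in selected), default=0.0)
            return lam * query_sim[d] - (1 - lam) * penalty
        best = max(candidates, key=score)
        selected.append(best)
        candidates.remove(best)
    return selected

# toy example: two near-duplicate red backpacks and one distinct item
query_sim = {"bag1": 0.9, "bag2": 0.85, "lipstick": 0.4}
doc_sim = {("bag1", "bag2"): 0.95, ("bag2", "bag1"): 0.95,
           ("bag1", "lipstick"): 0.1, ("lipstick", "bag1"): 0.1,
           ("bag2", "lipstick"): 0.1, ("lipstick", "bag2"): 0.1}
picked = mmr(query_sim, doc_sim, ["bag1", "bag2", "lipstick"], top_k=2)
```

With this lambda the near-duplicate second backpack is penalized enough that the less relevant but more diverse item is picked second.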

RRF/MMR

Query: “Red backpack for School”

User context: “teenager, girl/woman, likes: beauty, fashion, luxury”

Criteria: Create a diverse result set matching the query and user context.

Discussion:

Is this what we expected, or do we need to improve?


Large Language Models

  • LLMs can do everything!?
  • Ask an LLM to rerank the results
  • Let’s add some context about Alice!
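
A sketch of the prompt-based approach (the prompt wording, the product titles, and parsing the reply as a comma-separated list of numbers are our own illustrative choices; the actual LLM call is deliberately left out):

```python
def build_rerank_prompt(query, user_context, products):
    """Build a prompt asking an LLM to reorder products by relevance."""
    lines = [
        "You are a product search reranker.",
        f"Query: {query}",
        f"User context: {user_context}",
        "Products:",
    ]
    lines += [f"{i}. {title}" for i, title in enumerate(products, 1)]
    lines.append("Return the product numbers, most relevant first, "
                 "as a comma-separated list.")
    return "\n".join(lines)

def parse_ranking(reply, products):
    """Map a '3, 1, 2' style reply back to product titles."""
    order = [int(tok) - 1 for tok in reply.replace(" ", "").split(",")]
    return [products[i] for i in order]

products = ["Red leather backpack", "Grey hiking backpack",
            "Red canvas school bag"]
prompt = build_rerank_prompt(
    "Red backpack for School",
    "teenager, girl/woman, likes: beauty, fashion, luxury",
    products,
)
# the prompt would be sent to an LLM; suppose it replies "1, 3, 2"
ranked = parse_ranking("1, 3, 2", products)
```

In practice the reply needs defensive parsing (LLMs do not always follow the output format), and the candidate list is kept small since each call is slow and costly.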


Large Language Model

Query: “Red backpack for School”

User context: “teenager, girl/woman, likes: beauty, fashion, luxury”

Criteria: Create a diverse result set matching the query and user context.

Discussion:

Is this what we expected, or do we need to improve?


Other Re-rankers

  • Learning To Rank (LTR)
    • Good at incorporating non-textual features (e.g., popularity, clicks)
    • You will need to build your own dataset and train your own model
  • Cross Encoders
    • Unlike bi-encoders, they score the query and document together, so the model sees the query context when ranking
  • External Reranking API
    • Cohere, JinaAI
    • Easy to use
    • Harder to tune

Re-rankers Spectrum (Simple → Expensive)

RRF* → MMR → Learning To Rank* → Cross Encoder* → Rerank API* → LLM

*Supported in Elasticsearch, e.g. through 8.16 retrievers

Conclusion

-> Search is difficult

-> Re-rankers are a good way to improve results

-> There is no magical AI solution or one-size-fits-all

-> Every situation needs a different solution

Thank you for listening!

Time for questions

We are hiring!

Coen Baars

Arian Stolwijk