Using ML For Personalizing Food Search

About Go-jek

Transport

Payments

Food Delivery

Shipments

...18+ products

About Go-Food

Category Search

Dish Search

Text Search

Search Architecture

Search Request

Metrics

Google Dataproc

Batch Jobs

Analytics

Why don’t we use historical data in real time for personalized search?

Data Points

Favorite Restaurants

Preferred Cuisines

Restaurant Menu visits

Order size

Recent Orders

Enriched User Data

Old document in Elasticsearch

Enriched document in Elasticsearch

Using Enriched Data

Booking conversion depends on multiple factors:

    • Match of the restaurant to the user search
    • Match of the restaurant to the customer’s taste
    • ...etc

Understanding how, and how much, each of these factors contributes is important for serving the right results to our customers

Here we use ML to rank the documents by the relevant factors

ML in Elasticsearch

Elasticsearch does not natively support running non-linear models for ranking documents.
Additional setup is required to run an ML model in Elasticsearch.

Two approaches:
1. Rank documents outside Elasticsearch

2. Use a plugin that supports non-linear models in Elasticsearch

Rank documents outside Elasticsearch

Fetch the top documents from Elasticsearch using the default query

Run the ML model on the fetched documents externally using PMML (Predictive Model Markup Language)
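The fetch-then-rerank flow can be sketched as follows. This is a minimal illustration, not the pipeline from the talk: `rerank`, `toy_score`, the feature names `text_match` and `cuisine_affinity`, and the sample documents are all hypothetical, and `toy_score` is only a stand-in for a model that would be evaluated from a PMML file.

```python
from typing import Callable, List


def rerank(docs: List[dict], score_fn: Callable[[dict], float], top_k: int = 10) -> List[dict]:
    """Re-order documents already fetched from the search engine by an external model score."""
    scored = [(score_fn(doc), doc) for doc in docs]
    # Highest model score first; sorting on the score only keeps ties in fetch order.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def toy_score(doc: dict) -> float:
    """Hypothetical stand-in for a PMML-evaluated model (weights are made up)."""
    return 0.7 * doc["text_match"] + 0.3 * doc["cuisine_affinity"]


# Hypothetical documents, as if returned by the default query.
fetched = [
    {"id": "r1", "text_match": 0.9, "cuisine_affinity": 0.1},
    {"id": "r2", "text_match": 0.6, "cuisine_affinity": 0.9},
    {"id": "r3", "text_match": 0.2, "cuisine_affinity": 0.3},
]

ranked_ids = [doc["id"] for doc in rerank(fetched, toy_score, top_k=2)]
print(ranked_ids)
```

Every candidate has to leave the search cluster with all of its features before it can be scored, so the amount of data transferred grows with the number of documents fetched.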


Our Findings:

Response times were high because many documents had to be fetched

Plugin to rank documents in Elasticsearch

The LearnToRank plugin provides support for running ML models in Elasticsearch.

Support for running multiple models.

  • XGBoost
  • LambdaMART

Our Findings:

Response times were within expectations

Can run multiple models at the same time

We went with the LearnToRank plugin

Getting started with LearnToRank

Installing LearnToRank:

Add this to the search query
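A minimal sketch of what such a search body could look like, assuming the plugin's `sltr` query is used inside a `rescore` block. The helper `build_ltr_search`, the `name` field, the `keywords` parameter, the model name and the window size are hypothetical; the parameter names in particular depend on how the model's feature set was defined.

```python
import json


def build_ltr_search(user_query: str, model_name: str, window_size: int = 100) -> dict:
    """Build a search body that rescores the top hits with a stored ranking model."""
    return {
        # Baseline query: picks the candidate documents (field name is an assumption).
        "query": {"match": {"name": user_query}},
        # Rescore phase: the model re-orders only the top `window_size` candidates.
        "rescore": {
            "window_size": window_size,
            "query": {
                "rescore_query": {
                    "sltr": {
                        "params": {"keywords": user_query},  # passed to the feature templates
                        "model": model_name,
                    }
                }
            },
        },
    }


body = build_ltr_search("nasi goreng", "food_search_model_v1")
print(json.dumps(body, indent=2))
```

Rescoring only a window of top hits keeps the model off the long tail of matches, which is one way to bound the added latency.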

Response contains metrics that LearnToRank creates

Our approach with LearnToRank

  • Run multiple models for each search category at the same time
  • Rollout to a certain percentage of search traffic
  • Collect metrics provided by LearnToRank to track performance of model
  • Collect booking metrics to track the business performance


Impartial Rollout

The same user should be shown both personalized and non-personalized results, so we can track which drives better conversion

Rollout Architecture

Search Request

30-minute TTL for keys

Select random model and query

The 30-minute TTL ensures that the same user is shown both personalized and non-personalized results over time
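The assignment logic can be sketched as below. This is an illustration under assumptions, not the production design: an in-process dictionary stands in for whatever key-value store holds the keys, and `VariantStore`, `variant_for` and the variant names are hypothetical.

```python
import random
import time
from typing import Dict, Optional, Tuple

TTL_SECONDS = 30 * 60  # 30-minute TTL for keys, as on the slide
VARIANTS = ["personalized", "non_personalized"]  # hypothetical variant names


class VariantStore:
    """Keeps one randomly chosen variant per user until its key expires."""

    def __init__(self, ttl: float = TTL_SECONDS, rng: Optional[random.Random] = None):
        self._ttl = ttl
        self._rng = rng or random.Random()
        self._entries: Dict[str, Tuple[str, float]] = {}  # user_id -> (variant, expires_at)

    def variant_for(self, user_id: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        entry = self._entries.get(user_id)
        if entry is not None and now < entry[1]:
            return entry[0]  # key still alive: keep the same variant
        # Key missing or expired: pick a variant at random and store it with a fresh TTL.
        variant = self._rng.choice(VARIANTS)
        self._entries[user_id] = (variant, now + self._ttl)
        return variant


store = VariantStore(rng=random.Random(0))
first = store.variant_for("user-1", now=0.0)
same_window = store.variant_for("user-1", now=1799.0)  # same variant as `first`
next_window = store.variant_for("user-1", now=1800.0)  # key expired, drawn again
```

Within one TTL window a user sees a consistent variant; across many windows the random draw exposes the same user to both.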

Training Model

We track these data points:

LearnToRank Metrics
Booking metrics
Search Select and Search metrics

Joining these data points gives the training data for the model.
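One way to picture the join, as a sketch under assumptions: logged feature values are keyed by search and restaurant, and each row gets a graded label (2 if the result was booked, 1 if it was only selected, 0 otherwise). The record fields, the grading scheme and the helper names are hypothetical, and the LETOR-style text line is just one common layout for ranking data; the trainer actually used may expect a different input.

```python
from typing import Iterable, List, Tuple

Row = Tuple[int, int, List[float]]  # (label, search_id, feature values)


def build_training_rows(ltr_logs: Iterable[dict], selects: Iterable[dict], bookings: Iterable[dict]) -> List[Row]:
    """Join logged features with select and booking events on (search_id, restaurant_id)."""
    selected = {(s["search_id"], s["restaurant_id"]) for s in selects}
    booked = {(b["search_id"], b["restaurant_id"]) for b in bookings}
    rows = []
    for log in ltr_logs:
        key = (log["search_id"], log["restaurant_id"])
        # Assumed grading: booking beats select, select beats no interaction.
        label = 2 if key in booked else 1 if key in selected else 0
        rows.append((label, log["search_id"], log["features"]))
    return rows


def to_letor_line(label: int, qid: int, features: List[float]) -> str:
    """Format one row as '<label> qid:<id> 1:<f1> 2:<f2> ...'."""
    feats = " ".join(f"{i}:{v}" for i, v in enumerate(features, start=1))
    return f"{label} qid:{qid} {feats}"


# Hypothetical sample events for a single search.
ltr_logs = [
    {"search_id": 1, "restaurant_id": "r1", "features": [0.9, 0.1]},
    {"search_id": 1, "restaurant_id": "r2", "features": [0.6, 0.9]},
    {"search_id": 1, "restaurant_id": "r3", "features": [0.2, 0.3]},
]
selects = [{"search_id": 1, "restaurant_id": "r1"}, {"search_id": 1, "restaurant_id": "r2"}]
bookings = [{"search_id": 1, "restaurant_id": "r2"}]

lines = [to_letor_line(*row) for row in build_training_rows(ltr_logs, selects, bookings)]
print("\n".join(lines))
```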

Creating model from training data:
https://github.com/lezzago/LambdaMart

Results of Personalized Search

23% increase in Booking Conversion Rate

14% increase in Click Conversion Rate

Next Steps

  • Query understanding (use similar terms)
  • Query expansion
  • Leveraging the food ontology during search

Questions?
