Building Tomorrow’s Products
Ammar Jawad,
Product Manager, Personalisation & ML Platform
Hotels.com at Expedia Group
Agenda
Introduction to Supervised Learning - I
What we know: data around patients cancelling and showing up to medical appointments.
What we want to know: the probability that a new patient will cancel their appointment.
Supervised learning maps the former to the latter.
Introduction to Supervised Learning - II
What we know: data on patients cancelling medical appointments.
What we want to know: the probability that a new patient will cancel their appointment.
Understanding Supervised Learning - I
Target labels
Understanding Supervised Learning - II
Training data
Testing data
Understanding Supervised Learning - III
Hide the labels from the machine, even though we know their values.
Understanding Supervised Learning - IV
“Able to predict whether a patient will show up with 80% accuracy.”
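Before taking that 80% at face value, it is worth seeing how accuracy behaves on imbalanced data. A minimal sketch with hypothetical numbers (80 shows, 20 cancellations), where a naive majority-class predictor hits the same 80%:

```python
# Hypothetical labels: 1 = showed up, 0 = cancelled the appointment.
y_true = [1] * 80 + [0] * 20
# A naive "model" that simply predicts everyone shows up:
y_pred = [1] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
print(accuracy)  # 0.8 -> "80% accuracy" while catching zero cancellations
```

This is exactly the accuracy paradox: the headline number hides that the model never predicts a cancellation.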
Accuracy Paradox
Accuracy Paradox - Not all mistakes are equal: a medical example
In this scenario we want to minimise false negatives (i.e. maximise recall), as it is more dangerous to send a sick patient home without treatment than to send a healthy patient for more checks.
|                  | Diagnosed Sick | Diagnosed Healthy |
| Actually Sick    | True Positive: diagnosed sick and is sick. | False Negative: patient is sick but the model diagnoses them as healthy and sends them home. |
| Actually Healthy | False Positive: patient is healthy but diagnosed sick. | True Negative: diagnosed healthy, sent home, and is healthy. |
Accuracy Paradox - Not all mistakes are equal: a spam example
In this scenario we want to minimise false positives (i.e. maximise precision), as it is better to receive a spam email in your inbox than to have an important email end up in your spam folder.
|          | Sent to Spam Folder | Sent to Inbox |
| Is Spam  | True Positive: email is spam and sent to the spam folder. | False Negative: email is spam but sent to the inbox. |
| Not Spam | False Positive: email is not spam but lands in the spam folder. | True Negative: email is not spam and correctly sent to the inbox. |
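The two tables above come down to two ratios. A small helper makes them concrete; the counts below are hypothetical, not from the talk:

```python
def precision_recall(tp, fp, fn):
    """Compute precision and recall from confusion-matrix counts."""
    precision = tp / (tp + fp)  # of everything flagged positive, how much was right?
    recall = tp / (tp + fn)     # of all actual positives, how many did we catch?
    return precision, recall

# Hypothetical spam filter: 90 spam caught, 5 good emails mis-flagged,
# 10 spam messages slipped into the inbox.
p, r = precision_recall(tp=90, fp=5, fn=10)
print(round(p, 3), round(r, 3))  # 0.947 0.9
```

The medical example optimises recall (drive false negatives down); the spam example optimises precision (drive false positives down).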
Let’s look at a real ML model
(Go to GitHub)
Reinforcement Learning
A/B/n testing
[Diagram: an A/B/n test. All visitors are split evenly, 25% of traffic to each of four button variants ("Buy now", "ADD TO CART", "PAY NOW", "CLICK & SEE WHAT HAPPENS"), yielding CTRs of 11.4%, 7.1%, 3.4% and 1.4%. Winner found!]
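One common way to call a winner in an A/B/n test like the one above is a two-proportion z-test between the top two variants. The click counts below are hypothetical (chosen to match the CTRs on the slide), and this is one standard approach, not necessarily the one used at Hotels.com:

```python
import math

# Hypothetical results: clicks out of 1,000 visitors per variant.
results = {"Buy now": 114, "ADD TO CART": 71, "PAY NOW": 34, "CLICK & SEE WHAT HAPPENS": 14}
n = 1000

# Rank variants by clicks and take the top two.
best, runner_up = sorted(results, key=results.get, reverse=True)[:2]

# Two-proportion z-test between the top two variants.
p1, p2 = results[best] / n, results[runner_up] / n
pooled = (results[best] + results[runner_up]) / (2 * n)
se = math.sqrt(pooled * (1 - pooled) * (2 / n))
z = (p1 - p2) / se
print(best, round(z, 2))  # z > 1.96, so the lead is significant at the ~95% level
```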
Reinforcement Learning in Experimentation
[Diagram: Multi-Armed Bandits loop through Action → Reward; Contextual Bandits add Context to the Action → Reward loop.]
Multi-Armed Bandits
Online decision making
Best long-term strategy may involve short-term sacrifices to maximise long-term gain
Other examples include:
Multi-Armed Bandits - Use-cases
Bandits are ideal whenever we don’t know the right numerical answer or where the sweet spot lies:
Multi-armed bandits are effectively an approach to experimentation using machine learning.
Multi-Armed Bandits - Trade-off
Multi-Armed Bandits - Exploration
Three main approaches to exploration:
1. Random exploration
Explore with some probability of taking a random action, e.g. explore 20% of the time.
2. Optimism in the face of uncertainty
When we know the value of every option except one action whose value is unknown, there is a bias towards the action with the unknown outcome.
3. Information state space
Consider the agent’s information as part of its state.
Look ahead to see how gathering information improves future reward.
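The first approach, exploring some fixed fraction of the time, is the classic epsilon-greedy strategy. A minimal sketch; the arm CTRs in the demo loop are made up:

```python
import random

class EpsilonGreedyBandit:
    """Explore with probability epsilon; otherwise exploit the best arm so far."""

    def __init__(self, n_arms, epsilon=0.2):
        self.epsilon = epsilon
        self.counts = [0] * n_arms    # pulls per arm
        self.values = [0.0] * n_arms  # running mean reward per arm

    def select_arm(self):
        if random.random() < self.epsilon:            # explore
            return random.randrange(len(self.counts))
        return self.values.index(max(self.values))    # exploit

    def update(self, arm, reward):
        self.counts[arm] += 1
        # incremental running-mean update
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

# Demo with made-up click-through rates; arm 1 should end up valued highest.
random.seed(0)
bandit = EpsilonGreedyBandit(n_arms=3)
true_ctr = [0.03, 0.11, 0.07]
for _ in range(5000):
    arm = bandit.select_arm()
    bandit.update(arm, 1.0 if random.random() < true_ctr[arm] else 0.0)
```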
Multi-Armed Bandits
[Diagram: a multi-armed bandit on the same four button variants. The first visitors are split 25% each; as CTRs of 11.4%, 7.1%, 3.4% and 1.4% emerge, the next 1,000 visitors are reallocated to roughly 70%, 20%, 7% and 3% of traffic respectively.]
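One well-known algorithm that produces reallocations like the diagram's is Thompson sampling over Beta posteriors. The deck doesn't name the algorithm used, so treat this as an illustrative sketch with hypothetical counts:

```python
import random

def allocate_traffic(clicks, views, batch=1000, draws=10_000):
    """Share of the next `batch` visitors each variant should get, estimated by
    sampling each variant's CTR from a Beta(1 + clicks, 1 + misses) posterior."""
    wins = [0] * len(clicks)
    for _ in range(draws):
        samples = [random.betavariate(1 + c, 1 + v - c) for c, v in zip(clicks, views)]
        wins[samples.index(max(samples))] += 1   # count which variant looked best
    return [round(batch * w / draws) for w in wins]

random.seed(1)
# Hypothetical first round: 250 visitors per variant, clicks matching the slide's CTRs.
shares = allocate_traffic(clicks=[29, 18, 9, 4], views=[250, 250, 250, 250])
```

The strongest variant captures most of the next batch while the others keep a small exploratory share, mirroring the 70/20/7/3 split on the slide.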
Multi-Armed Bandits - Same Assumptions as A/B testing
Contextual Bandits I
[Diagram: one bandit per context dimension. Split by region (North America, Europe, Asia, Africa), by time of day (Morning, Noon, Evening, Night), and by customer type (Solo, Family, Romance, Business), each segment converges on its own winning variant among "Buy now", "ADD TO CART", "PAY NOW" and "CLICK & SEE WHAT HAPPENS".]
Contextual Bandits II
[Diagram: a contextual bandit over combined states, each with its own winner, e.g. North America | Night | Family → "PAY NOW"; Asia | Morning | Business → "ADD TO CART"; Europe | Evening | Solo → "Buy now".]
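A contextual bandit can be sketched as one learner per combined state. This toy version keeps per-context click counts and is illustrative only; the real platform's implementation isn't described in the deck:

```python
import random

class ContextualBandit:
    """One epsilon-greedy learner per combined state, e.g. ("Europe", "Evening", "Solo")."""

    def __init__(self, arms, epsilon=0.1):
        self.arms = arms
        self.epsilon = epsilon
        self.stats = {}  # context -> {arm: [clicks, views]}

    def select(self, context):
        per_arm = self.stats.setdefault(context, {a: [0, 0] for a in self.arms})
        if random.random() < self.epsilon:  # explore
            return random.choice(self.arms)
        # exploit: highest observed CTR within this context (unseen arms count as 0)
        return max(self.arms, key=lambda a: per_arm[a][0] / per_arm[a][1] if per_arm[a][1] else 0.0)

    def update(self, context, arm, clicked):
        clicks_views = self.stats[context][arm]
        clicks_views[0] += int(clicked)
        clicks_views[1] += 1

bandit = ContextualBandit(["Buy now", "ADD TO CART", "PAY NOW"])
context = ("Europe", "Evening", "Solo")
arm = bandit.select(context)             # pick a button for this visitor's state
bandit.update(context, arm, clicked=True)
```

Each combined state learns independently, which is why different regions, times of day, and customer types can converge on different winners.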
Contextual Bandits
Challenging Experimentation Assumptions
Opportunity Cost in Experimentation
Continuous Exploration
ML Opportunity Framework
Helps product teams evaluate whether a use-case:
ML Framework - Step 1
ML Framework - Step 2
Contact details