1 of 48

Machine learning in action

How we doubled revenue on a game with over a billion players

Robert Magyar

5.11.2018

2 of 48

Hello!

I am Robert Magyar

Data scientist/Analyst @ Cellense

You can find me at robert.magyar@cellense.com

Data Science Club 2018

3 of 48

Agenda

  1. Introduction
  2. Approach to personalization
  3. Case study: Hill Climb Racing 2 Special offers
  4. Learnings and next steps
  5. Q&A

4 of 48

  • Analytics + Monetization (LiveOps) + User Acquisition
  • Founded by Ivan Trancik (co-founded Exponea)
  • Bootstrapped, 40+ employees (London, Prague, Berlin, Bratislava)
  • 1 Top 100 grossing mobile game (US: Hill Climb Racing 2)
  • 3 #1 best-selling PC games (Tom Clancy’s The Division, Kingdom Come: Deliverance, Stellaris)

5 of 48

Cellense selected clients

6 of 48

Approach to personalization

7 of 48

Types of players in a game

  • On average, only 1% of players spend any money in a game
  • 20% of payers bring in 80% of revenue
  • A small percentage of players is essential for sustaining and growing the game
  • Games are built around this fact, with deep progression systems

How can we improve players’ satisfaction with their purchases and increase revenue? => Personalization

8 of 48

Current state of mobile games

Every type of player either gets the same price and content in an offer or gets to choose from 3-5 potential offers. There is no consideration of:

  • Their payment potential
  • Their spending habits
  • Their behavior in the game

This results in lower conversion and lower player satisfaction, which leads to lower revenue potential.

9 of 48

Understanding players through models

  • Monetary segmentation (modified RFM): understanding payment potential
  • Spending habits (agglomerative hierarchical clustering): understanding spending habits of players
  • Content definition based on behavior in the game (random forest): understanding behavior of players in the game

10 of 48

Path to personalization - the process

  • Monetary and behavioral analysis, hypothesis generation: understanding payers through basic and more complex KPIs and analysis.
  • Monetary segmentation: segmentation based on several monetary attributes.
  • Behavioral segmentation: segmentation based on several engagement aspects of players.
  • Parameter optimization of segments: optimization of segment boundaries, pricing and content.
  • Evaluation and continuous improvement: the goal is to learn and continuously improve segment definitions, content and pricing.

Maximization of revenue and offer conversion

11 of 48

Case study

Personalized offers in Hill Climb Racing 2

12 of 48

Hill Climb Racing 2

Fingersoft

Released Q4 2016

iOS & Android

multiplayer racing game

1 billion downloads

2.5 million daily players

400 terabytes of data

13 of 48

14 of 48

15 of 48

Case study - overview

Goal

  • personalized offers
  • deliver relevant content to players at the relevant time for the relevant price
  • increase revenue and enjoyment from purchasing an offer

Approach

  • set up a LiveOps stack that can deliver offers to players
  • use a combination of RFM and behavioral segmentation to create personalized offers
  • A/B testing for segmentation methods and parameters
  • minimize cannibalization of future profits
  • execution & evaluation
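
A minimal sketch of how an A/B test on offer conversion could be evaluated, assuming a control group and a segmented-offer group are compared with a two-proportion z-test. The function name and all counts are illustrative assumptions, not figures from the case study.

```python
import math


def two_proportion_z_test(conv_a, n_a, conv_b, n_b):
    """Return (z, two-sided p-value) for the difference in conversion rates."""
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal CDF.
    p_value = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p_value


# Made-up counts: group A sees the old offer, group B the segmented offer.
z, p = two_proportion_z_test(conv_a=120, n_a=10_000, conv_b=165, n_b=10_000)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Conversion alone is not the whole picture: revenue per player should be checked as well, since a cheaper offer can convert better and still earn less.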

16 of 48

Pre-personalization status

The game monetizes through

  • direct gem pack purchases
  • seasonal offers (Halloween, Christmas, Back to school)
  • special offers (offers with skin and good value, constantly rotating through shop)
  • rank-based offers (tied to progression)
  • rotating offers based on rank progression

17 of 48

Elements of offers to optimize

Amount of resources

Additional value

Offer price

Availability

Type of chests

18 of 48

We are aiming for

Minimize additional value

Price

Content

Value

Availability

Optimization

Increase conversion

19 of 48

The biggest challenges

  1. Avoiding cannibalization of future profit
    • how much additional value should players get?
  2. Reacting to players’ changing behavior during progression
    • what does the player need at the moment?
  3. Generating a smaller number of very effective offers
    • focusing on offers that are as distinct as possible
  4. No unique content can be used
    • no skins for vehicles or drivers, or any other customization

20 of 48

LiveOps stack

21 of 48

LiveOps offer stack

Business analytics (BigQuery + Periscope)

Analytics server

LiveOps server

Game client

Understanding player behavior (monetization and gameplay data)

Defining target payer segments & offer contents

Exporting segments (segment ID + user ID config in BigQuery)

Downloading segments from BigQuery to analytics server (segment ID + user ID config)

Distributing correct segment ID to users

Downloading segment ID + user ID config from analytics server

Displaying segmented popup offers

Sending analytics data to BigQuery (popup + IAP data)

Evaluation of results

Manual setup of popup offer definitions

22 of 48

Feature engineering

23 of 48

Understanding player behavior

  • Looking at 150+ features, but focusing on selecting the right ones
  • Relationships between some of them are hard to understand

Purchasing behavior

Rank progression

Currency spending

Upgrading preference

Usage of tuning parts

Vehicle usage

24 of 48

Understanding player behavior - example features

  • Percentage of revenue per type of purchase
  • Percentage of purchases per type of purchase
  • Conversion per type
  • How many legendary cards equipped per vehicle
  • Frequency of changing tuning parts
  • Spending preference
  • Usage of free gems
  • Spending hard currency on events

  • Wide or deep upgrading
  • Type of vehicles upgraded (sports cars, funky cars, motorcycles)
  • Most played vehicle
  • Most upgraded vehicle
  • Last purchased vehicle
  • Season rank
  • Player’s personal rank
  • Duration to rank up
  • How many rank downs

Rank progression

Vehicle usage

Upgrading preference

Purchasing behavior

Currency spending

Usage of tuning parts

25 of 48

Feature engineering - examples

  • Combinations of features are very powerful
  • Good feature engineering can lead to finding interesting patterns
  • Much of a model’s success is determined by feature engineering

Tells us how aggressively a player purchases. This can create a very interesting new segmentation based on player aggressiveness: whales can be divided into more and less aggressive ones, and the rotation of offers adjusted accordingly.

Tells us the probability of purchasing a particular type of offer.

26 of 48

Building targeting models

27 of 48

Third iteration

Maximizing potential

Generally, revenue can be maximized by modifying:

  1. Value (mainly defined by recency - pilot)
  2. Price (monetary segmentation)
  3. Content (behavioral segmentation + content specification model)

The likelihood of purchasing personalized offers is tied to the spending capacity of each segment:

  • Top and higher spenders are more likely to buy when content is better personalized
  • Incentivization and remaining segments are more likely to buy when the price is changed

Changing the content of an offer should therefore improve conversion more for higher segments than for lower segments.

28 of 48

Building targeting models

  • Monetary segmentation (modified RFM): understanding monetary possibilities of players
  • Behavioral segmentation (agglomerative hierarchical clustering): understanding players’ preferences
  • Prediction of chest type (random forest, feature importance): additional personalization of chest content

29 of 48

Monetary segmentation

30 of 48

RFM (BI method) 1/2

Every payer gets their own monetary, frequency and recency scores, plus additional scores based on features. The scores define the segment a player is assigned to.

Basic generation of segment boundaries => bucketing

In our case => boundaries are rule-based, created after deep analytical work.

Example dimensions, bucketing based on distribution:

  • Monetary:
    • Closeness of the maximum purchase price and the 90th percentile to the price points set by the game economy
    • Cumulative revenue as a multiple of the price of an offer
  • Frequency of purchases
  • Recency
    • Playtime
    • Purchasing

Source: Customer segmentation using RFM in SAS
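
A minimal sketch of rule-based RFM bucketing, under stated assumptions: the boundary values, field names and segment labels below are hypothetical placeholders, since the real boundaries came from analysis of the game economy and are not given here.

```python
from bisect import bisect_right
from dataclasses import dataclass

# Hypothetical rule-based boundaries (not the ones used in the case study).
MONETARY_BOUNDS = [5.0, 20.0, 100.0]   # cumulative revenue in USD
FREQUENCY_BOUNDS = [2, 5, 10]          # number of purchases
RECENCY_BOUNDS = [7, 30, 90]           # days since last purchase

# Hypothetical labels, indexed by monetary score.
SEGMENT_NAMES = ["low spender", "medium spender", "high spender", "top spender"]


@dataclass
class Payer:
    user_id: str
    cumulative_revenue: float
    purchase_count: int
    days_since_last_purchase: int


def bucket(value, bounds):
    """Index of the bucket the value falls into (0 = lowest)."""
    return bisect_right(bounds, value)


def rfm_scores(payer):
    """Return (recency, frequency, monetary) scores, each from 0 to 3."""
    # Recency is inverted: a more recent purchase gives a higher score.
    recency = len(RECENCY_BOUNDS) - bucket(payer.days_since_last_purchase, RECENCY_BOUNDS)
    frequency = bucket(payer.purchase_count, FREQUENCY_BOUNDS)
    monetary = bucket(payer.cumulative_revenue, MONETARY_BOUNDS)
    return recency, frequency, monetary


def monetary_segment(payer):
    """Simplified segment assignment that uses the monetary score only."""
    _, _, monetary = rfm_scores(payer)
    return SEGMENT_NAMES[monetary]


whale = Payer("u1", cumulative_revenue=250.0, purchase_count=12, days_since_last_purchase=3)
lapsed = Payer("u2", cumulative_revenue=4.0, purchase_count=1, days_since_last_purchase=120)
print(rfm_scores(whale), monetary_segment(whale))
print(rfm_scores(lapsed), monetary_segment(lapsed))
```

Keeping the boundaries in plain lists makes them easy to adjust by hand, which is the kind of precise segment modification that rule-based RFM is suited for.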

31 of 48

RFM (BI method) 2/2

Advantages compared to other segmentation methods:

  • Well suited for precise segment modifications, needed mostly for the top and high spender segments
  • Easy to evaluate subsegments within bigger segments, using any classification method

Shifting players from one segment to another can increase conversion but decrease total revenue.

The result of this segmentation is the price of an offer.

Fun fact: there are players who have spent $500+ in total but never more than $5 on a single purchase

Simplified RFM segmentation

95th Percentile of maximum purchased value

Recency of purchase

Cumulative revenue

32 of 48

Behavioral segmentation

33 of 48

Example hypothesis - large portions of segments are underserved

Based on analytics, we see that players who previously preferred one type of purchase are significantly less satisfied with the content structure of our personalized offers.

We used a 70% revenue threshold to manually segment these types of players.
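
A minimal sketch of such a manual rule, assuming the threshold means that at least 70% of a player's revenue comes from a single purchase type. The function name, the purchase type keys and the revenue figures are illustrative assumptions.

```python
def preferred_purchase_type(revenue_by_type, threshold=0.70):
    """Return the purchase type that dominates a player's revenue, or 'no preference'."""
    total = sum(revenue_by_type.values())
    if total <= 0:
        return "no preference"
    top_type, top_revenue = max(revenue_by_type.items(), key=lambda kv: kv[1])
    # The player counts as preferring a type only if it reaches the threshold share.
    if top_revenue / total >= threshold:
        return top_type
    return "no preference"


# Made-up revenue per purchase type (USD) for two players.
gem_lover = {"gems": 80.0, "special offers": 15.0, "popup offers": 5.0}
mixed = {"gems": 40.0, "special offers": 35.0, "popup offers": 25.0}
print(preferred_purchase_type(gem_lover))
print(preferred_purchase_type(mixed))
```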

34 of 48

Understanding spending habits

Because we need to understand the boundaries of segments (e.g. what portion of revenue and purchases needs to come from buying gems, special offers and popup offers to confidently say that a player prefers a large amount of gems in their offer), we need to use more advanced machine learning methods.

We researched clustering methods and chose the most suitable one for our use case =>

Agglomerative hierarchical clustering = finding players who have a similar distribution of purchases and grouping them together.

Process of finding similar players:

  1. Every player is its own cluster at the start
  2. Iteratively find the 2 most similar groups of players based on a distance metric, e.g. Euclidean or Manhattan
  3. Link subsegments together until all are linked
  4. By cutting the dendrogram (picture on the right) horizontally, we find the final clusters of players
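
The four steps can be sketched in plain Python. This is a naive illustration under stated assumptions, not the production implementation: it uses Ward's merge cost on squared Euclidean distance, stops merging once a requested number of clusters remains (which corresponds to cutting the dendrogram at that level), and runs on made-up revenue shares per purchase type.

```python
from itertools import combinations


def centroid(points):
    """Mean vector of a list of equal-length points."""
    n = len(points)
    return [sum(p[d] for p in points) / n for d in range(len(points[0]))]


def ward_cost(a, b):
    """Increase in within-cluster sum of squares if clusters a and b are merged."""
    ca, cb = centroid(a), centroid(b)
    sq_dist = sum((x - y) ** 2 for x, y in zip(ca, cb))
    return len(a) * len(b) / (len(a) + len(b)) * sq_dist


def agglomerate(points, n_clusters):
    # Step 1: every player starts as its own cluster (lists of row indices).
    clusters = [[i] for i in range(len(points))]
    # Steps 2-3: repeatedly link the two most similar clusters.
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda ij: ward_cost(
                [points[k] for k in clusters[ij[0]]],
                [points[k] for k in clusters[ij[1]]],
            ),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]  # safe because i < j
    # Step 4: stopping here is equivalent to cutting the dendrogram at this level.
    return [sorted(c) for c in clusters]


# Made-up share of revenue per purchase type: (gems, special offers, popup offers).
players = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.85, 0.10],
    [0.30, 0.30, 0.40],
    [0.35, 0.35, 0.30],
]
print(agglomerate(players, 3))
```

This version recomputes every pairwise cost on each pass, so it is only practical for small samples. A library implementation would be the realistic choice on real player data.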

35 of 48

Type of linkage

Ward’s method
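
For reference, Ward’s method merges at each step the two clusters whose union gives the smallest increase in the total within-cluster sum of squared distances. In the standard formulation (the symbols are introduced here and do not appear on the slides), the cost of merging clusters A and B with centroids μ_A and μ_B is:

```latex
\Delta(A, B) = \frac{|A|\,|B|}{|A| + |B|} \, \lVert \mu_A - \mu_B \rVert^2
```

Here |A| and |B| are the numbers of players in each cluster, so merging two large clusters is penalized more than merging two small ones at the same centroid distance.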

36 of 48

Third iteration - first approach

Extracting knowledge from segmentation

Gathered knowledge for the high spenders segment:

  • We need at least 4-5 segments to cover the whole high spenders segment according to their preferences
  • Players who strongly prefer gems or special offers have very low conversion on our popup offers
  • Most purchases come from the no-preference segment and from players who prefer popups
  • Some players see value only in gems, others only in more general offers

[Figure: clustered heatmap of purchase preferences with dendrograms; panels 1-8 are described in the legend]

Legend:

1. Percentage preference in purchases (black - lowest, white - highest)

2. Heatmap - showing intensity of purchases per purchase type

3. Dendrogram - visual representation of clustering

4. Manual segmentation before clustering (blue - gems, yellow - no preference, green - special offers, purple - popup offers, brown - buy gems payers)

5. Purchased segmented offers in second iteration (red - yes, grey - no)

6. Agglomerative hierarchical clustering final segmentation (see 4 for color meaning)

7. Feature set - percent of revenue and purchases per purchase type (gems, special offers and popup offers)

8. Dendrogram for feature set

37 of 48

Content specification

38 of 48

Progression defines needs of players

Players progress through various flows - for example:

  • Step 1: Vehicle purchase with soft currency
  • Step 2: Vehicle upgrading until all soft currency is depleted
  • Step 3: Joining an event and seeing what vehicles the winners are playing
  • Step 4: Opening race chests and seeing a rare card
  • Purchase: Player purchases a bundle with the new vehicle from the event, a skin and an additional vehicle chest

Can we find these patterns and learn from our existing data?

39 of 48

Content type specification

We additionally specify content using the most important attributes, found with a random forest with 100 trees:

Data:

  • Last vehicle purchased
  • Most played vehicle in terms of races (last day, 1 week, month etc.)
  • Most upgraded vehicle (last day, 1 week, month etc.)
  • Most time played (last day, 1 week, month etc.)
  • Vehicle with the maximum number of upgrades
  • Sum of resources spent on any vehicle

Goal: predict legendary chest purchases based on the player’s state at the point of purchase

Target: 17 classes - 17 types of vehicles in the game

Fun fact: 20% of players buy a chest for a vehicle that they do not own (possible user interface issue)

Most played vehicle last 250 races

Most upgraded vehicle last 30 days
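
A minimal sketch of how a window feature such as "most played vehicle, last 250 races" could be derived, assuming a per-player race log ordered from oldest to newest. The function name, the log format and the vehicle names are placeholders for illustration.

```python
from collections import Counter


def most_played_vehicle(race_log, last_n=250):
    """Most frequent vehicle among the player's last_n races, or None if no races."""
    if last_n <= 0:
        raise ValueError("last_n must be positive")
    recent = race_log[-last_n:]
    if not recent:
        return None
    return Counter(recent).most_common(1)[0][0]


# Made-up log: the player switched vehicles, then briefly switched back.
log = ["vehicle_a"] * 300 + ["vehicle_b"] * 200 + ["vehicle_a"] * 40
print(most_played_vehicle(log))                   # recent window
print(most_played_vehicle(log, last_n=len(log)))  # full history
```

The window matters: over the full history this player looks like a `vehicle_a` player, while the recent window shows what they are driving now.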

40 of 48

Result

41 of 48

Generation of final offer

Random forest

RFM

Hierarchical clustering
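
A minimal sketch of how the three model outputs could be combined into one offer: RFM gives the price, hierarchical clustering gives the content focus, and the random forest gives the chest type. All names, prices and content labels are hypothetical placeholders, not the values used in the game.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Offer:
    price_usd: float
    content_focus: str
    chest_vehicle: str


# Hypothetical lookup tables; the real ones would come from the segmentation work.
PRICE_BY_MONETARY_SEGMENT = {
    "top spender": 49.99,
    "high spender": 19.99,
    "medium spender": 9.99,
    "low spender": 4.99,
}
CONTENT_BY_BEHAVIOR_CLUSTER = {
    "gems": "gem-heavy bundle",
    "special offers": "bundle with extra value items",
    "popup offers": "standard popup bundle",
}
DEFAULT_CONTENT = "balanced bundle"


def build_offer(monetary_segment, behavior_cluster, predicted_vehicle):
    """Combine price (RFM), content (clustering) and chest type (random forest)."""
    return Offer(
        price_usd=PRICE_BY_MONETARY_SEGMENT[monetary_segment],
        # Players without a clear preference fall back to a balanced bundle.
        content_focus=CONTENT_BY_BEHAVIOR_CLUSTER.get(behavior_cluster, DEFAULT_CONTENT),
        chest_vehicle=predicted_vehicle,
    )


offer = build_offer("high spender", "gems", "vehicle_a")
print(offer)
```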

42 of 48

Evaluation

43 of 48

Improving through iterations

2 iterations, lots of learnings and doubling the revenue

+40%

+108%

44 of 48

Better than seasonal offers

With well-targeted special offers we were able to generate 2x revenue during the offer weekend - better than any previous seasonal offer.

Referee

Back to school

Chopper

4th July

Segmented offers v2

45 of 48

Next steps

46 of 48

How about ongoing segmented LiveOps?

  • Player segmentation can be a very effective tool to drive the revenue
  • 2x-10x revenue during LiveOps events

=> Could be as much as 80% of your revenue!

47 of 48

THANK YOU

robert.magyar@cellense.com

48 of 48

We are hiring!

cellense.com/jobs