Machine learning in action
How we doubled revenue on a game
with over a billion players
Robert Magyar
5.11.2018
Hello!
I am Robert Magyar
Data scientist/Analyst @ Cellense
You can find me at robert.magyar@cellense.com
Data Science Club 2018
Agenda
Cellense selected clients
Approach to personalization
Types of players in a game
How can we improve players' satisfaction with purchasing and increase revenue? => personalization
Current state of mobile games
Every type of player either gets the same price and content in an offer or gets to choose from 3-5 potential offers; there is no consideration of:
This results in lower conversion and lower player satisfaction, which leads to lower revenue potential.
Understanding players through models
Monetary segmentation (Modified RFM)
Spending habits (Agglomerative hierarchical clustering)
Content definition based on behavior in a game (Random forest)
Understanding payment potential
Understanding spending habits of players
Understanding behavior of players in the game
Path to personalization - the process
Monetary and behavioral analysis, hypothesis generation
Understanding payers through basic and more complex KPIs and analysis.
Evaluation and continuous improvement
The goal is to learn and continuously improve segment definitions, content and pricing.
Monetary segmentation
Segmentation based on several monetary attributes.
Parameter optimization of segments
Optimization of boundaries of segments, pricing and content.
Behavioral segmentation
Segmentation based on several engagement aspects of players.
Maximization of revenue and offer conversion
Case study
Personalized offers in Hill Climb Racing 2
Hill Climb Racing 2
Fingersoft
Released Q4 2016
iOS & Android
multiplayer racing game
1 billion downloads
2.5 million daily players
400 terabytes of data
Case study - overview
Goal
Approach
Pre-personalization status
Game monetizes through
Elements of offers to optimize
Amount of resources
Additional value
Offer price
Availability
Type of chests
We are aiming for
Minimize additional value
Price
Content
Value
Availability
Optimization
Increase conversion
The biggest challenges
LiveOps stack
LiveOps offer stack
Business analytics (BigQuery + Periscope)
Analytics server
LiveOps server
Game client
Understanding player behavior
(monetization and gameplay data)
Defining target payer segments & offer contents
Exporting segments (segment ID + user ID config in BigQuery)
Downloading segments from BigQuery to analytics server (segment ID + user ID config)
Distributing correct segment ID to users
Downloading segment ID + user ID config from analytics server
Displaying segmented popup offers
Sending analytics data to BigQuery (popup + IAP data)
Evaluation of results
Manual setup of popup offer definitions
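The "segment ID + user ID config" hand-off between BigQuery, the analytics server and the game client can be pictured with a small Python sketch. The CSV layout, the function names and the fallback segment are assumptions made for illustration, not the actual format used in this stack:

```python
import csv
import io

# Hypothetical fallback for users who are not in any exported segment.
DEFAULT_SEGMENT_ID = "default"


def load_segment_config(csv_text):
    """Parse exported 'user_id,segment_id' rows into a lookup table."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["user_id"]: row["segment_id"] for row in reader}


def segment_for(user_id, config):
    """Return the segment ID to send to a given user."""
    return config.get(user_id, DEFAULT_SEGMENT_ID)


# Made-up export, standing in for the config downloaded from BigQuery.
export = "user_id,segment_id\nu1,high_gems\nu2,low_popup\n"
config = load_segment_config(export)
print(segment_for("u1", config))
print(segment_for("unknown_user", config))
```

A fallback segment keeps the client able to show a generic offer when a user has no entry in the config yet.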
Feature engineering
Understanding player behavior
Purchasing behavior
Rank progression
Currency spending
Upgrading preference
Usage of tuning parts
Vehicle usage
Understanding player behavior - example features
Rank progression
Vehicle usage
Upgrading preference
Purchasing behavior
Currency spending
Usage of tuning parts
Feature engineering - examples
Tells us how aggressively a player purchases. This can create a very interesting new segmentation based on the aggressivity of players: it can divide whales into aggressive and less aggressive ones and augment the rotation of offers.
Tells us the probability of a player purchasing a particular type of offer.
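The slide does not spell out how these two features are computed. One possible reading, sketched in Python: aggressivity as purchases per day since the first purchase, and purchase probability as the share of each offer type among a player's past purchases. Both formulas and all names are assumptions for illustration only:

```python
from collections import Counter
from datetime import date


def purchase_aggressivity(purchase_days, today):
    """Purchases per day since the first purchase (assumed definition)."""
    if not purchase_days:
        return 0.0
    days_active = (today - min(purchase_days)).days + 1
    return len(purchase_days) / days_active


def offer_type_probabilities(purchased_types):
    """Empirical share of each offer type in past purchases (assumed estimate)."""
    counts = Counter(purchased_types)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {offer_type: n / total for offer_type, n in counts.items()}


# Made-up purchase history of one player.
days = [date(2018, 11, 1), date(2018, 11, 3), date(2018, 11, 5)]
print(purchase_aggressivity(days, today=date(2018, 11, 5)))
print(offer_type_probabilities(["gems", "gems", "popup", "special"]))
```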
Building targeting models
Third iteration
Maximizing potential
Generally, revenue can be maximized by modifying:
The likelihood of purchasing personalized offers is tied to the monetary possibilities of segments:
Changing the content of an offer should increase conversion more for higher segments than for lower segments.
Building targeting models
Monetary segmentation (Modified RFM)
Behavioral segmentation (Agglomerative hierarchical clustering)
Prediction of chest type (Random forest - feature importance)
Understanding monetary possibilities of players
Understanding players preferences
Additional personalization of chest content
Monetary segmentation
RFM (BI method) 1/2
Every payer gets their own monetary, frequency and recency scores, plus additional scores based on features. The scores define the segment a player is assigned to.
Basic generation of segment boundaries => bucketing
In our case => boundaries are rule-based, created after deep analytical work.
Example dimensions, bucketing based on distribution:
Source: Customer segmentation using RFM in SAS
RFM (BI method) 2/2
Advantage compared to any other segmentation method:
Shifting players from one segment to another can increase conversion but decrease total revenue.
The result of this segmentation is the price of an offer.
Fun fact: There are players who spend $500+ in total but never spend more than $5 on a single purchase.
Simplified RFM segmentation
95th Percentile of maximum purchased value
Recency of purchase
Cumulative revenue
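The three dimensions above can be combined into a rule-based segment. A minimal Python sketch follows. The boundary values and segment names are made up (the real ones came from the analytical work and are not given here), and reading "95th percentile of maximum purchased value" as the 95th percentile of a player's purchase amounts is itself an assumption:

```python
from dataclasses import dataclass
from datetime import date
from statistics import quantiles


@dataclass
class Purchase:
    day: date
    amount_usd: float


def percentile_95(values):
    """95th percentile; a single purchase is returned as-is."""
    if len(values) == 1:
        return values[0]
    return quantiles(values, n=100, method="inclusive")[94]


def rfm_features(purchases, today):
    """Compute the three simplified RFM dimensions for one payer."""
    amounts = [p.amount_usd for p in purchases]
    return {
        "p95_purchase": percentile_95(amounts),
        "recency_days": (today - max(p.day for p in purchases)).days,
        "cumulative_revenue": sum(amounts),
    }


def monetary_segment(features):
    """Map features to a segment using hypothetical rule-based boundaries."""
    if features["recency_days"] > 90:
        return "lapsed"
    if features["cumulative_revenue"] >= 500 and features["p95_purchase"] <= 5:
        return "frequent_small_spender"  # the "fun fact" players
    if features["p95_purchase"] >= 50:
        return "high"
    if features["p95_purchase"] >= 10:
        return "mid"
    return "low"
```

Keeping the purchase-size dimension separate from cumulative revenue is what lets the rules tell a $500+ payer who only ever buys $5 offers apart from one who buys $50 offers, which matters because the output of this step is a price.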
Behavioral segmentation
Example hypothesis - large portions of segments are underserved
Based on analytics, we see that players who previously preferred one type of purchase are significantly less satisfied with the content structure of our personalized offers.
We used a 70% threshold on revenue to manually segment this type of player.
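A threshold rule of this kind could look like the Python sketch below. It assumes the 70% refers to the share of a player's revenue that comes from a single purchase type; the function name and the type labels are illustrative:

```python
def preferred_purchase_type(revenue_by_type, threshold=0.70):
    """Return the dominant purchase type, or 'no preference' below the threshold."""
    total = sum(revenue_by_type.values())
    if total <= 0:
        return "no preference"
    top_type, top_revenue = max(revenue_by_type.items(), key=lambda kv: kv[1])
    if top_revenue / total >= threshold:
        return top_type
    return "no preference"


# Made-up revenue (in USD) per purchase type for two players.
print(preferred_purchase_type({"gems": 80.0, "special_offers": 15.0, "popup_offers": 5.0}))
print(preferred_purchase_type({"gems": 40.0, "special_offers": 35.0, "popup_offers": 25.0}))
```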
Understanding spending habits
To understand the boundaries of segments (e.g. what portion of revenue and purchases needs to come from buying gems, special offers and popup offers before we can confidently say that a player prefers a large amount of gems in their offer), we need more advanced machine learning methods.
We researched clustering methods and chose the one most suitable for our use case =>
Agglomerative hierarchical clustering = finding players who have a similar distribution of purchases and grouping them together.
Process of finding similar players:
Type of linkage: Ward’s method
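Agglomerative clustering with Ward linkage can be tried out with SciPy. The sketch below uses six made-up players described by their share of revenue per purchase type (gems, special offers, popup offers); the data and the choice of three clusters are assumptions for illustration:

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

# Rows: players. Columns: share of revenue from gems, special offers,
# popup offers. The values are invented for this example.
shares = np.array([
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.90, 0.05],
    [0.10, 0.10, 0.80],
    [0.05, 0.15, 0.80],
])

# Ward linkage merges, at each step, the pair of clusters whose merge
# increases the within-cluster variance the least (Euclidean distances).
tree = linkage(shares, method="ward")

# Cut the tree into at most three flat clusters.
labels = fcluster(tree, t=3, criterion="maxclust")
print(labels)
```

The same `tree` can be passed to `scipy.cluster.hierarchy.dendrogram` to draw the dendrogram and inspect where to cut it.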
Third iteration - first approach
Extracting knowledge from segmentation
Gathered knowledge for high spenders segment:
[Figure: clustered heatmap for the high spenders segment; the numbered panels 1-8 are described in the legend]
Legend:
1. Percentage preference in purchases (black - lowest, white - highest)
2. Heatmap - showing intensity of purchases per purchase type
3. Dendrogram - visual representation of clustering
4. Manual segmentation before clustering (blue - gems, yellow - no preference, green - special offers, purple - popup offers, brown - buy gems payers)
5. Purchased segmented offers in second iteration (red - yes, grey - no)
6. Agglomerative hierarchical clustering final segmentation (see 3 for color meaning)
7. Feature set - percent of revenue and purchases per purchase type (gems, special offers and popup offers)
8. Dendrogram for feature set
Content specification
Progression defines needs of players
Players progress through various flows - for example:
Step 1: Vehicle purchase with soft currency
Step 2: Vehicle upgrading until all soft currency is depleted
Step 3: Joining an event and seeing what vehicles the winners are playing
Step 4: Opening race chests and seeing a rare card
Purchase: Player purchases a bundle with the new vehicle from the event, a skin and an additional vehicle chest
Can we find these patterns and learn from our existing data?
Content type specification
We additionally specify content using the most important attributes from a random forest with 100 trees:
Data:
Goal: Predict legendary chest purchases based on the player's state at the point of purchase
Target: 17 classes - 17 types of vehicles in the game
Fun fact: 20% of players buy a chest for a vehicle that they do not own (possible user interface issue)
Most played vehicle last 250 races
Most upgraded vehicle last 30 days
Result
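How feature importance is read off a 100-tree random forest can be shown with scikit-learn. The sketch below uses the two attributes named above as features and 17 vehicle classes as the target, but the data is synthetic and built so that the most played vehicle usually matches the purchased chest, so the importances it prints say nothing about the real game:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)
n_players, n_vehicles = 2000, 17

# Synthetic player state at the point of purchase (vehicles coded 0..16;
# integer codes are used for brevity, one-hot encoding is an alternative).
most_played = rng.integers(0, n_vehicles, n_players)
most_upgraded = rng.integers(0, n_vehicles, n_players)

# Synthetic target: 90% of chests match the most played vehicle.
random_pick = rng.integers(0, n_vehicles, n_players)
chest_vehicle = np.where(rng.random(n_players) < 0.9, most_played, random_pick)

feature_names = [
    "most_played_vehicle_last_250_races",
    "most_upgraded_vehicle_last_30_days",
]
X = np.column_stack([most_played, most_upgraded])

model = RandomForestClassifier(n_estimators=100, random_state=0)
model.fit(X, chest_vehicle)

# Impurity-based importances, one value per feature, summing to 1.
importances = dict(zip(feature_names, model.feature_importances_))
top_feature = max(importances, key=importances.get)
print(importances)
print(top_feature)
```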
Generation of final offer
Random forest
RFM
Hierarchical clustering
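Putting the three model outputs together might look like the Python sketch below: the RFM segment sets the price, the behavioral cluster sets the main content, and the random forest prediction picks the vehicle for the chest. The price table, content table and all names are hypothetical placeholders, not the values used in the game:

```python
from dataclasses import dataclass

# Hypothetical lookup tables; real prices and contents are not given here.
PRICE_BY_MONETARY_SEGMENT = {"low": 1.99, "mid": 9.99, "high": 49.99}
CONTENT_BY_BEHAVIORAL_SEGMENT = {
    "gems": "large gem pack",
    "special_offers": "vehicle bundle",
    "popup_offers": "mixed resources",
    "no preference": "mixed resources",
}


@dataclass(frozen=True)
class Offer:
    price_usd: float
    main_content: str
    chest_vehicle: str


def build_offer(monetary_segment, behavioral_segment, predicted_vehicle):
    """Combine the outputs of the three models into one offer definition."""
    return Offer(
        price_usd=PRICE_BY_MONETARY_SEGMENT[monetary_segment],
        main_content=CONTENT_BY_BEHAVIORAL_SEGMENT[behavioral_segment],
        chest_vehicle=predicted_vehicle,
    )


print(build_offer("high", "gems", "Chopper"))
```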
Evaluation
Improving through iterations
2 iterations
lots of learnings
and doubling the revenue
+40%
+108%
Better than seasonal offers
With well-targeted special offers we were able to generate 2x revenue during the offer weekend - better than any seasonal offer before.
Referee
Back to school
Chopper
4th July
Segmented offers v2
Next steps
How about ongoing segmented LiveOps?
=> Could be as much as 80% of your revenue!
THANK YOU
robert.magyar@cellense.com