1 of 54

Agent-based modelling of the data economy

Lawrence Kay Senior Policy Advisor

lawrence.kay@theodi.org

Sara Mahmoud Researcher

sara.mahmoud@theodi.org

Nigel Shardlow

Planning Director

nigel@sandtable.com

Lies Boelen

Data Scientist

lies@sandtable.com


Agenda:

Agent-based modelling of the data economy

  1. Project introduction
  2. Our modelling process, and an introduction to ABMs
  3. Our data economy ABM
  4. Playing with the model, and discussion


theODI.org


Agent-based modelling of the data economy: what is our aim?

To explore how agent-based models might help policy-makers understand the effects of data policy on the data economy.



Agent-based modelling of the data economy: what will we produce?

A playable model of the data economy that helps policy-makers to explore some of its features, and guidance on how to develop an ABM as a non-specialist.



Agent-based modelling of the data economy: what is the data economy?

Data is non-rivalrous and excludable, and innovative products and services are produced by a combination of data and human capital.



Agent-based modelling of the data economy: what is the nature of policy-making for the data economy?

Policy questions for the data economy probably sit within a complex environment, one with features of emergence and self-organisation.



Agent-based modelling of the data economy: which policies could we investigate?

Ethics

Organisational change

Portability

What are the effects of data being used more ethically?

What are the effects of organisations being able to better extract value from data?

What are the effects of portability on competition and innovation, and what happens if we also develop data trusts?



Agent-based modelling of the data economy: why model?

We are building on the Blackett Review

Key themes

  • Shared mental model
  • Sharing and challenge
  • Learning from surprises

Source: Government Office for Science (2018) Computational Modelling: Technological Futures



Agent-based modelling of the data economy: where are we starting with our model?

Source: Government Office for Science (2018) Computational Modelling: Technological Futures

Each model purpose has essential features and risks:

  • Prediction: anticipates unknown data. Risk: conditions of application unclear.
  • Explanation: uses plausible mechanisms to match outcome data in a well-defined manner. Risk: the model is ‘brittle’, so minor changes in the set-up result in a bad fit to the explained data.
  • Understanding theory: systematically maps out or establishes the consequences of some mechanisms. Risks: mistakes in the model specification; inadequate coverage of possibilities.
  • Illustration: shows an idea clearly. Risk: over-interpretation to make theoretical or empirical claims.
  • Analogy: maps to what is being modelled in a plausible but flexible way and provides new insights. Risk: confusion between a way of thinking about something and the truth; this model gives no support to empirical claims.



Your help

We would like your insights, challenges, and questions on our model as it develops towards its final version in March 2019


  • A very short introduction to agent-based modelling
  • Introducing the ODI Data Economy Model
  • A chance to play with the model we have built
  • Your suggestions for how to develop the model

Coming up...


Agent-Based Modelling: A Very Short Introduction


Thomas Schelling (1978) Micromotives and Macrobehaviour


Racial map of Baltimore

Image: Erik Fischer. Shared under Creative Commons Attribution-Sharealike 2.0 Generic license.


Segregation model

Image: Geraint Ian Palmer. Image shared with the permission of the author.

Each dot represents a family.

Families prefer at least x% of their neighbours to be the same colour as them.

If this criterion isn’t met, they move until it is.


Segregation model

Each dot represents a family.

Families prefer at least 60% of their neighbours to be the same colour as them.

If this criterion isn’t met, they move until it is.

Image: Geraint Ian Palmer. Image shared with the permission of the author.


Segregation model

Showing final states for a range of different values of same-colour percentage preference

Strong segregation emerges even at lower preference thresholds

Image: Geraint Ian Palmer. Image shared with the permission of the author.
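The rules above can be sketched in a few lines of Python. The grid size, the preference threshold, and the share of empty cells below are illustrative parameters, not taken from any particular implementation.

```python
import random

# A minimal sketch of Schelling's segregation model.
# SIZE, THRESHOLD, and EMPTY are illustrative parameters.
SIZE, THRESHOLD, EMPTY = 20, 0.6, 0.1

def make_grid():
    # Each cell holds a family of colour 'A' or 'B', or None if empty.
    return [[None if random.random() < EMPTY else random.choice("AB")
             for _ in range(SIZE)] for _ in range(SIZE)]

def unhappy(grid, r, c):
    # A family is unhappy if fewer than THRESHOLD of its occupied
    # neighbours (8-neighbourhood, wrapping at the edges) share its colour.
    me = grid[r][c]
    if me is None:
        return False
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = grid[(r + dr) % SIZE][(c + dc) % SIZE]
            if neighbour is not None:
                total += 1
                same += neighbour == me
    return total > 0 and same / total < THRESHOLD

def step(grid):
    # Move every unhappy family to a randomly chosen empty cell.
    empties = [(r, c) for r in range(SIZE) for c in range(SIZE)
               if grid[r][c] is None]
    moved = 0
    for r in range(SIZE):
        for c in range(SIZE):
            if unhappy(grid, r, c) and empties:
                er, ec = empties.pop(random.randrange(len(empties)))
                grid[er][ec], grid[r][c] = grid[r][c], None
                empties.append((r, c))
                moved += 1
    return moved

random.seed(0)
g = make_grid()
for _ in range(50):          # iterate until nobody wants to move
    if step(g) == 0:
        break
```

Even this toy version reproduces the headline result: with moderate preference thresholds, strongly segregated clusters emerge.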


Linking Macro and Micro

Source: Coleman, J (1990) Foundations of Social Theory

Micro-level decisions aggregate up to unexpected macro-level effects and patterns

Macro level phenomena can be explained in terms of the micromotives that are driving them

Increased levels of computer power and data availability make the study of the relationship between micro and macro tractable


Coleman’s Boat (or Bathtub)


Model Components (Boxology) - Schelling Model


ABM of the data economy


Goal of the model

To answer the question:

“What are the effects of data portability on competition and innovation?”


Boxology


Overview

  • Agent attributes
    • Firms
    • Consumers
  • Behavioural rules
    • Firm attributes
    • Product quality
    • Consumers choosing a firm
    • Consumer attributes
    • Data value
    • Data portability
  • Scenario: a privacy issue
  • App
  • Possible extensions


Agent Attributes


Which attributes does a firm have?

  • Capital - how much capital does a firm have?
    • Of which they can invest a certain amount into product development
  • Data accessibility - what data does a firm have access to?
    • Also feeds into product development, but is not consumed in the process
  • Privacy Score - what is the firm’s reputation for privacy?
    • A number between 0 (bad) and 1 (good)
  • Data portability - who does the firm have agreements with?
    • A list of other firms from/to which data is easily portable when users switch
  • Product quality - how good is the firm’s product offering?
    • A positive number (higher = better)


Which attributes does a consumer have?

  • Privacy concern - how much is the consumer worried about firms’ reputations for privacy?
    • A number between 0 (no concern) and 1 (very concerned)
  • Income - how much does the consumer earn?
    • A random number between 0 and 1
    • This is the amount of money that passes between a consumer and a firm when the consumer uses the firm’s product


Behavioural rules


How can a firm’s attributes change?

  • Capital
    • Gained from consumers who choose the firm’s product
    • A capped amount is lost during product development
  • Data accessibility
    • Data is collected when consumers use the firm’s product - but only data from that time point (tick)
    • If data is portable between two firms, older data can be transferred (more later)
    • Data doesn’t get lost, but does become less valuable over time
  • Privacy score
    • This can be changed directly - we will illustrate this in a scenario later
  • Product quality
    • Product quality is updated at every tick, and is a function of the capital invested, and the available data


Which rules affect consumer attributes?

  • Privacy concern
    • This can be changed directly - we will illustrate this in a scenario later
  • Income
    • Is stable in the current version of the model


Product quality

  • Every firm puts all of its data and a capped amount of capital into product development.
    • The assumption here is that very wealthy firms can only make use of so much of their capital
  • The increase in product quality is a function of capital and data value
  • We use a square root function to prevent larger firms from gaining too much advantage too quickly
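A minimal sketch of this update rule. The deck specifies a cap on capital and a square root of capital and data value; combining the two multiplicatively, and the cap of 10.0, are illustrative assumptions of this sketch.

```python
import math

def quality_increase(capital, data_value, capital_cap=10.0):
    # Cap the capital a firm can usefully invest in one tick:
    # very wealthy firms can only make use of so much of it.
    invested = min(capital, capital_cap)
    # The square root dampens returns, so larger firms do not gain
    # too much advantage too quickly. Multiplying capped capital by
    # data value is an assumption, not the model's confirmed form.
    return math.sqrt(invested * data_value)
```

Note that a firm with capital far above the cap gets the same quality boost as one sitting exactly at the cap.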


Value of data

Firms hold historical data from users, but this value depreciates over time.

The intuition here is that newly harvested consumer data is worth more than that harvested in the past.


Value of data (mathematically)

  • The value of data decreases exponentially
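Exponential depreciation can be sketched as below; the decay rate of 0.05 per tick is an illustrative assumption.

```python
import math

def data_value(initial_value, age, decay_rate=0.05):
    # Data harvested `age` ticks ago is worth less than fresh data:
    # value decays exponentially with age. The decay rate is an
    # illustrative assumption, not the model's confirmed parameter.
    return initial_value * math.exp(-decay_rate * age)
```

Fresh data (age 0) keeps its full value, and the value of older records falls smoothly towards zero without ever being deleted, matching the rule that data is never lost but becomes less valuable.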


Example of the consumer utility function

[Diagram: a consumer, currently with Firm 1, evaluates the utilities of firms 1, 2, and 3; Firm 1 has portability agreements with firms 2 and 3]

The consumer is most likely to choose firm 2, since it has the highest net utility.


How does a consumer choose a firm?

  • A utility function U(firm) is computed for each firm, combining the firm’s product quality, its privacy score (weighted by the consumer’s privacy concern), and a boost for data portability
  • These utilities translate into probabilities of buying each firm’s product
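A hedged sketch of this choice step. The exact functional form was not legible in the deck, so the way the terms combine, and the softmax rule for turning utilities into probabilities, are assumptions; the default privacy weight of 4 echoes the Wpriv = 4 used in the examples later.

```python
import math
import random

def utility(quality, privacy_score, privacy_concern,
            portable_with_current=False, w_priv=4.0, w_port=1.0):
    # Illustrative form only: the deck lists the ingredients
    # (quality, privacy, portability) but not the exact formula.
    u = quality
    # Concerned consumers penalise firms with poor privacy scores.
    u -= w_priv * privacy_concern * (1.0 - privacy_score)
    # Portability with the consumer's current firm boosts utility.
    if portable_with_current:
        u += w_port
    return u

def choose(utilities):
    # Translate utilities into choice probabilities via a softmax
    # (an assumption), then sample a firm index.
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    probs = [e / total for e in exps]
    r, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i, probs
    return len(probs) - 1, probs
```

Sampling rather than always taking the highest utility matches the slide’s wording that the consumer is only “most likely” to choose the best firm.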


How is data portability implemented? (1)

  • Assume a consumer switches from firm A to firm B.
  • There is no data portability between the two firms
  • Firm A retains all the old data about the customer
  • Firm B only gets the data starting at the time of the switch


How is data portability implemented? (2)

  • Assume a consumer switches from firm A to firm B.
  • There is data portability between the two firms
  • Firm A retains all the old data about the customer
  • Firm B receives the ported data from firm A about the consumer
  • Firm B also collects new data from the time of the switch onward
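The two cases can be sketched as a single function. Representing each firm’s data store as a per-consumer list of (tick, value) records is an assumption of this sketch, not the model’s confirmed data structure.

```python
def port_data(firm_a, firm_b, consumer_id, portable):
    # Firm A always retains its historical records; firm stores are
    # dicts mapping consumer id -> list of (tick, value) records.
    history = firm_a.get(consumer_id, [])
    if portable:
        # With portability, firm B receives a copy of the old data.
        firm_b.setdefault(consumer_id, []).extend(history)
    # Without portability, firm B starts with nothing; in both cases
    # it collects new data from the switch onward (appended by the
    # simulation loop each tick, not shown here).
    return firm_b
```

Copying rather than moving the records captures the rule that firm A keeps its data while firm B gains it.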


Reminder ...

… data portability gives a boost to the utility function


So what does this look like?


[Model visualisation: consumers and firms]

Scenario


The privacy scenario

  • At a certain tick, a firm suffers a privacy issue (think: a certain social media website in the aftermath of certain elections). There is a two-pronged effect:
    • The privacy score of the firm involved drops
    • The privacy concern of consumers goes up by a capped amount
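The two-pronged shock can be sketched as below. The shock tick of 150 matches the examples that follow, while the sizes of the score drop and the concern rise are illustrative assumptions.

```python
def apply_privacy_shock(firms, consumers, hit_firm, tick, shock_tick=150,
                        score_drop=0.5, concern_rise=0.2, concern_cap=1.0):
    # At the shock tick, the affected firm's privacy score drops and
    # every consumer's privacy concern rises by a capped amount.
    # score_drop and concern_rise are illustrative assumptions.
    if tick != shock_tick:
        return
    firms[hit_firm]["privacy_score"] = max(
        0.0, firms[hit_firm]["privacy_score"] - score_drop)
    for c in consumers:
        c["privacy_concern"] = min(concern_cap,
                                   c["privacy_concern"] + concern_rise)
```

Both attributes stay inside their defined 0-to-1 ranges: the score is floored at 0 and the concern is capped at 1.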


Example: World with two big firms

Three panels compare runs:

  • No privacy scenario, privacy weight (Wpriv) = 4
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 4
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 6


Example: World with one big firm

Three panels compare runs:

  • No privacy scenario, privacy weight (Wpriv) = 4
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 6
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 8


App


Possible future directions


What should go into the next iteration of the model?

  • Should we introduce different types of products?
    • Firms could be active in one or more markets
    • Data collected through different products is additive
  • Should we use a different, discontinuous model of innovation?
    • Instead of the incremental changes we have now, firms could bet money on making the next ‘big’ change
  • Should we introduce the birth and death of firms?
    • Firms that haven’t innovated/have no customers could disappear
    • Small firms popping up, potentially with new types of product
  • Should we let firms buy other firms?
    • Both data and the product would be transferred
    • Could affect the privacy score


What should go into the next iteration of the model? (continued)

  • Should we adjust the value of newly-ported data?
    • Newly acquired data, even if it’s old, has more use than data that has already been used
  • Should we let firms buy and sell data to one another?
    • Apart from the cost in capital, there could be a cost in privacy score for both companies involved
  • Should we let firms buy data from consumers?
    • This assumes that consumers can own data, and that data is somehow portable
  • Should we add more scenarios?
    • E.g. change data portability during the simulation
  • Should we give access to more components in the app? If so, which ones?
    • E.g. let the user change the quality curve, or the initial distribution of privacy scores


Playtime


Group exercises


What’s next?

Get in touch

We would like your insights, challenges, and questions on our model as it develops towards its final version in March 2019. The next workshop will be in late February or early March 2019.

lawrence.kay@theodi.org


References

  • Joshua Epstein ‘Why Model?’ (2008), JASSS
  • Peter McBurney ‘What are models for?’ (2011), EUMAS
  • John Little ‘Models and Managers: The Concept of a Decision Calculus’ (1970) Management Science 16(8)
  • Galit Shmueli ‘To Explain or to Predict?’ (2010), Statistical Science 25(3)
  • Nigel Shardlow ‘Explanatory and predictive behavioural modelling.’ (2015) chapter in ‘Thinking about behaviour change: an interdisciplinary dialogue.’
  • Schelling, T C (1978) Micromotives and Macrobehaviour
  • For more on Schelling’s connection with Kubrick:
    • (https://www.youtube.com/watch?v=7MlguSQ8emY)
    • https://www3.nd.edu/~dlindley/handouts/strangelovenotes.html
