1 of 54

Agent-based modelling of the data economy

Lawrence Kay Senior Policy Advisor

lawrence.kay@theodi.org

Sara Mahmoud Researcher

sara.mahmoud@theodi.org

Nigel Shardlow

Planning Director

nigel@sandtable.com

Lies Boelen

Data Scientist

lies@sandtable.com


Agenda:

Agent-based modelling of the data economy

  1. Project introduction
  2. Our modelling process, and an introduction to ABMs
  3. Our data economy ABM
  4. Playing with the model, and discussion


theODI.org


Agent-based modelling of the data economy: what is our aim?

To explore how agent-based models might help policy-makers understand the effects of data policy on the data economy.



Agent-based modelling of the data economy: what will we produce?

A playable model of the data economy that helps policy-makers to explore some of its features, and guidance on how to develop an ABM as a non-specialist.



Agent-based modelling of the data economy: what is the data economy?

Data is non-rivalrous and excludable, and innovative products and services are produced by a combination of data and human capital.



Agent-based modelling of the data economy: what is the nature of policy-making for the data economy?

Policy questions for the data economy probably sit within a complex environment, one with features of emergence and self-organisation.



Agent-based modelling of the data economy: which policies could we investigate?

Ethics

Organisational change

Portability

What are the effects of data being used more ethically?

What are the effects of organisations being able to better extract value from data?

What are the effects of portability on competition and innovation, and what happens if we also develop data trusts?



Agent-based modelling of the data economy: why model?

We are building on the Blackett Review

Key themes

  • Shared mental model
  • Sharing and challenge
  • Learning from surprises

Source: Government Office for Science (2018) Computational Modelling: Technological Futures



Agent-based modelling of the data economy: where are we starting with our model?

Source: Government Office for Science (2018) Computational Modelling: Technological Futures

Each model purpose has essential features and risks:

  • Prediction: anticipates unknown data. Risk: conditions of application unclear.
  • Explanation: uses plausible mechanisms to match outcome data in a well-defined manner. Risk: the model is ‘brittle’, so minor changes in the set-up result in a bad fit to the explained data.
  • Understanding theory: systematically maps out or establishes the consequences of some mechanisms. Risks: mistakes in the model specification; inadequate coverage of possibilities.
  • Illustration: shows an idea clearly. Risk: over-interpretation to make theoretical or empirical claims.
  • Analogy: maps to what is being modelled in a plausible but flexible way and provides new insights. Risk: confusion between a way of thinking about something and the truth; this model gives no support to empirical claims.



Your help

We would like your insights, challenges, and questions on our model as it develops towards its final version in March 2019


  • A very short introduction to agent-based modelling
  • Introducing the ODI Data Economy Model
  • A chance to play with the model we have built
  • Your suggestions for how to develop the model

Coming up...


Agent-Based Modelling: A Very Short Introduction


Thomas Schelling (1978) Micromotives and Macrobehaviour


Racial map of Baltimore

Image: Erik Fischer. Shared under Creative Commons Attribution-Sharealike 2.0 Generic license.


Segregation model

Image: Geraint Ian Palmer. Image shared with the permission of the author.

Each dot represents a family.

Families prefer at least x% of their neighbours to be the same colour as them.

If this criterion isn’t met, they move until it is.


Segregation model

Each dot represents a family.

Families prefer at least 60% of their neighbours to be the same colour as them.

If this criterion isn’t met, they move until it is.

Image: Geraint Ian Palmer. Image shared with the permission of the author.


Segregation model

Showing final states for a range of different values of same-colour percentage preference

Strong segregation emerges even at lower preference thresholds

Image: Geraint Ian Palmer. Image shared with the permission of the author.
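The rules above can be sketched in a few lines of Python. The grid size, the preference threshold, and the share of empty cells below are illustrative parameters, not taken from any particular implementation.

```python
import random

# A minimal sketch of Schelling's segregation model.
# SIZE, THRESHOLD, and EMPTY are illustrative parameters.
SIZE, THRESHOLD, EMPTY = 20, 0.6, 0.1

def make_grid():
    # Each cell holds a family of colour 'A' or 'B', or None if empty.
    return [[None if random.random() < EMPTY else random.choice("AB")
             for _ in range(SIZE)] for _ in range(SIZE)]

def unhappy(grid, r, c):
    # A family is unhappy if fewer than THRESHOLD of its occupied
    # neighbours (8-neighbourhood, wrapping at the edges) share its colour.
    me = grid[r][c]
    if me is None:
        return False
    same = total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue
            neighbour = grid[(r + dr) % SIZE][(c + dc) % SIZE]
            if neighbour is not None:
                total += 1
                same += neighbour == me
    return total > 0 and same / total < THRESHOLD

def step(grid):
    # Move every unhappy family to a randomly chosen empty cell.
    empties = [(r, c) for r in range(SIZE) for c in range(SIZE)
               if grid[r][c] is None]
    moved = 0
    for r in range(SIZE):
        for c in range(SIZE):
            if unhappy(grid, r, c) and empties:
                er, ec = empties.pop(random.randrange(len(empties)))
                grid[er][ec], grid[r][c] = grid[r][c], None
                empties.append((r, c))
                moved += 1
    return moved

random.seed(0)
g = make_grid()
for _ in range(50):          # iterate until nobody wants to move
    if step(g) == 0:
        break
```

Even this toy version reproduces the headline result: with moderate preference thresholds, strongly segregated clusters emerge.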


Linking Macro and Micro

Source: Coleman, J (1990) Foundations of Social Theory

Micro-level decisions aggregate up to unexpected macro-level effects and patterns

Macro level phenomena can be explained in terms of the micromotives that are driving them

Increased levels of computer power and data availability make the study of the relationship between micro and macro tractable


Coleman’s Boat (or Bathtub)


Model Components (Boxology) - Schelling Model


ABM of the data economy


Goal of the model

To answer the question:

“What are the effects of data portability on competition and innovation?”


Boxology


Overview

  • Agent attributes
    • Firms
    • Consumers
  • Behavioural rules
    • Firm attributes
    • Product quality
    • Consumers choosing a firm
    • Consumer attributes
    • Data value
    • Data portability
  • Scenario: a privacy issue
  • App
  • Possible extensions


Agent Attributes


Which attributes does a firm have?

  • Capital - how much capital does a firm have?
    • Of which they can invest a certain amount into product development
  • Data accessibility - what data does a firm have access to?
    • Also feeds into product development, but is not consumed in the process
  • Privacy Score - what is the firm’s reputation for privacy?
    • A number between 0 (bad) and 1 (good)
  • Data portability - who does the firm have agreements with?
    • A list of other firms from/to which data is easily portable when users switch
  • Product quality - how good is the firm’s product offering?
    • A positive number (higher = better)


Which attributes does a consumer have?

  • Privacy concern - how much is the consumer worried about firms’ reputations for privacy?
    • A number between 0 (no concern) and 1 (very concerned)
  • Income - how much does the consumer earn?
    • A random number between 0 and 1
    • This is the amount of money that passes between a consumer and a firm when the consumer uses the firm’s product


Behavioural rules


How can a firm’s attributes change?

  • Capital
    • Gained from consumers who choose the firm’s product
    • A capped amount is lost during product development
  • Data accessibility
    • Data is collected when consumers use the firm’s product - but only data from that time point (tick)
    • If data is portable between two firms, older data can be transferred (more later)
    • Data doesn’t get lost, but does become less valuable over time
  • Privacy score
    • This can be changed directly - we will illustrate this in a scenario later
  • Product quality
    • Product quality is updated at every tick, and is a function of the capital invested, and the available data


Which rules affect consumer attributes?

  • Privacy concern
    • This can be changed directly - we will illustrate this in a scenario later
  • Income
    • Is stable in the current version of the model


Product quality

  • Every firm puts all of its data and a capped amount of capital into product development.
    • The assumption here is that very wealthy firms can only make use of so much of their capital
  • The increase in product quality is a function of capital and data value
  • We use a square root function to prevent larger firms from gaining too much advantage too quickly
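A minimal sketch of this update rule. The deck specifies a cap on capital and a square root of capital and data value; combining the two multiplicatively, and the cap of 10.0, are illustrative assumptions of this sketch.

```python
import math

def quality_increase(capital, data_value, capital_cap=10.0):
    # Cap the capital a firm can usefully invest in one tick:
    # very wealthy firms can only make use of so much of it.
    invested = min(capital, capital_cap)
    # The square root dampens returns, so larger firms do not gain
    # too much advantage too quickly. Multiplying capped capital by
    # data value is an assumption, not the model's confirmed form.
    return math.sqrt(invested * data_value)
```

Note that a firm with capital far above the cap gets the same quality boost as one sitting exactly at the cap.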


Value of data

Firms hold historical data from users, but this value depreciates over time.

The intuition here is that newly harvested consumer data is worth more than that harvested in the past.


Value of data (mathematically)

  • The value of data decreases exponentially
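Exponential depreciation can be sketched as below; the decay rate of 0.05 per tick is an illustrative assumption.

```python
import math

def data_value(initial_value, age, decay_rate=0.05):
    # Data harvested `age` ticks ago is worth less than fresh data:
    # value decays exponentially with age. The decay rate is an
    # illustrative assumption, not the model's confirmed parameter.
    return initial_value * math.exp(-decay_rate * age)
```

Fresh data (age 0) keeps its full value, and the value of older records falls smoothly towards zero without ever being deleted, matching the rule that data is never lost but becomes less valuable.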


Example of the consumer utility function

[Diagram: a consumer, currently with Firm 1, evaluates the utilities of firms 1, 2, and 3; Firm 1 has portability agreements with firms 2 and 3]

The consumer is most likely to choose firm 2, since it has the highest net utility.


How does a consumer choose a firm?

  • A utility function U(firm) is computed for each firm, combining the firm’s product quality, its privacy score (weighted by the consumer’s privacy concern), and a boost for data portability
  • These utilities translate into probabilities of buying each firm’s product
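A hedged sketch of this choice step. The exact functional form was not legible in the deck, so the way the terms combine, and the softmax rule for turning utilities into probabilities, are assumptions; the default privacy weight of 4 echoes the Wpriv = 4 used in the examples later.

```python
import math
import random

def utility(quality, privacy_score, privacy_concern,
            portable_with_current=False, w_priv=4.0, w_port=1.0):
    # Illustrative form only: the deck lists the ingredients
    # (quality, privacy, portability) but not the exact formula.
    u = quality
    # Concerned consumers penalise firms with poor privacy scores.
    u -= w_priv * privacy_concern * (1.0 - privacy_score)
    # Portability with the consumer's current firm boosts utility.
    if portable_with_current:
        u += w_port
    return u

def choose(utilities):
    # Translate utilities into choice probabilities via a softmax
    # (an assumption), then sample a firm index.
    exps = [math.exp(u) for u in utilities]
    total = sum(exps)
    probs = [e / total for e in exps]
    r, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i, probs
    return len(probs) - 1, probs
```

Sampling rather than always taking the highest utility matches the slide’s wording that the consumer is only “most likely” to choose the best firm.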


How is data portability implemented? (1)

  • Assume a consumer switches from firm A to firm B.
  • There is no data portability between the two firms
  • Firm A retains all the old data about the customer
  • Firm B only gets the data starting at the time of the switch


How is data portability implemented? (2)

  • Assume a consumer switches from firm A to firm B.
  • There is data portability between the two firms
  • Firm A retains all the old data about the customer
  • Firm B receives the ported data from firm A about the consumer
  • Firm B also collects new data from the time of the switch onward
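The two cases can be sketched as a single function. Representing each firm’s data store as a per-consumer list of (tick, value) records is an assumption of this sketch, not the model’s confirmed data structure.

```python
def port_data(firm_a, firm_b, consumer_id, portable):
    # Firm A always retains its historical records; firm stores are
    # dicts mapping consumer id -> list of (tick, value) records.
    history = firm_a.get(consumer_id, [])
    if portable:
        # With portability, firm B receives a copy of the old data.
        firm_b.setdefault(consumer_id, []).extend(history)
    # Without portability, firm B starts with nothing; in both cases
    # it collects new data from the switch onward (appended by the
    # simulation loop each tick, not shown here).
    return firm_b
```

Copying rather than moving the records captures the rule that firm A keeps its data while firm B gains it.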


Reminder ...

… data portability gives a boost to the utility function


So what does this look like?


[Model visualisation: consumers and firms]

Scenario


The privacy scenario

  • At a certain tick, a firm suffers a privacy issue (think: a certain social media website in the aftermath of certain elections). There is a two-pronged effect:
    • The privacy score of the firm involved drops
    • The privacy concern of consumers goes up by a capped amount
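The two-pronged shock can be sketched as below. The shock tick of 150 matches the examples that follow, while the sizes of the score drop and the concern rise are illustrative assumptions.

```python
def apply_privacy_shock(firms, consumers, hit_firm, tick, shock_tick=150,
                        score_drop=0.5, concern_rise=0.2, concern_cap=1.0):
    # At the shock tick, the affected firm's privacy score drops and
    # every consumer's privacy concern rises by a capped amount.
    # score_drop and concern_rise are illustrative assumptions.
    if tick != shock_tick:
        return
    firms[hit_firm]["privacy_score"] = max(
        0.0, firms[hit_firm]["privacy_score"] - score_drop)
    for c in consumers:
        c["privacy_concern"] = min(concern_cap,
                                   c["privacy_concern"] + concern_rise)
```

Both attributes stay inside their defined 0-to-1 ranges: the score is floored at 0 and the concern is capped at 1.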


Example: World with two big firms

Three panels compare runs:

  • No privacy scenario, privacy weight (Wpriv) = 4
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 4
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 6


Example: World with one big firm

Three panels compare runs:

  • No privacy scenario, privacy weight (Wpriv) = 4
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 6
  • Privacy scenario at tick 150, privacy weight (Wpriv) = 8


App


Possible future directions


What should go into the next iteration of the model?

  • Should we introduce different types of products?
    • Firms could be active in one or more markets
    • Data collected through different products is additive
  • Should we use a different, discontinuous model of innovation?
    • Instead of the incremental changes we have now, firms could bet money on making the next ‘big’ change
  • Should we introduce the birth and death of firms?
    • Firms that haven’t innovated/have no customers could disappear
    • Small firms popping up, potentially with new types of product
  • Should we let firms buy other firms?
    • Both data and the product would be transferred
    • Could affect the privacy score


What should go into the next iteration of the model? (continued)

  • Should we adjust the value of newly-ported data?
    • Newly acquired data, even if it’s old, has more use than data that has already been used
  • Should we let firms buy and sell data to one another?
    • Apart from the cost in capital, there could be a cost in privacy score for both companies involved
  • Should we let firms buy data from consumers?
    • This assumes that consumers can own data, and that data is somehow portable
  • Should we add more scenarios?
    • E.g. change data portability during the simulation
  • Should we give access to more components in the app? If so, which ones?
    • E.g. let the user change the quality curve, or the initial distribution of privacy scores


Playtime


Group exercises


What’s next?

Get in touch

We would like your insights, challenges, and questions on our model as it develops towards its final version in March 2019. The next workshop will be in late February or early March 2019.

lawrence.kay@theodi.org


References

  • Joshua Epstein ‘Why Model?’ (2008), JASSS
  • Peter McBurney ‘What are models for?’ (2011), EUMAS
  • John Little ‘Models and Managers: The Concept of a Decision Calculus’ (1970) Management Science 16(8)
  • Galit Shmueli ‘To Explain or to Predict?’ (2010), Statistical Science 25(3)
  • Nigel Shardlow ‘Explanatory and predictive behavioural modelling.’ (2015) chapter in ‘Thinking about behaviour change: an interdisciplinary dialogue.’
  • Schelling, T C (1978) Micromotives and Macrobehaviour
  • For more on Schelling’s connection with Kubrick:
    • (https://www.youtube.com/watch?v=7MlguSQ8emY)
    • https://www3.nd.edu/~dlindley/handouts/strangelovenotes.html
