1 of 67

Fostering open ecosystems around data

The role of data standards, infrastructure and institutions

November 2021 | @timdavies | tim@practicalparticipation.co.uk

2 of 67

3 of 67

Drawing on

  • Participatory practice
  • Information Infrastructure Studies
  • Data Feminism
  • Indigenous Data Sovereignty
  • & other critical data practice

4 of 67

5 of 67

6 of 67

Talk outline

  1. Standards 101
  2. Data ecosystems
  3. Data infrastructures
  4. Data Institutions
  5. Reflections & challenges

7 of 67

Standards 101

8 of 67

12-10-2021

2021-10-12

Oct 12th 2021

der 12. Oktober 21

1633993200

October 12th 2021

Seshhanbeh 1400 Mehr 20

Mon Oct 11 2021 23:00:00 GMT+0000

12/10/2021

9 of 67

12-10-2021

2021-10-12

Oct 12th 2021

der 12. Oktober 21

1633993200

October 12th 2021

Seshhanbeh 1400 Mehr 20

Mon Oct 11 2021 23:00:00 GMT+0000

12/10/2021

US Date: 10th December

European: 12th October
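
The ambiguity is easy to reproduce. The following is a minimal Python sketch using only the standard library: it reads the same string under both conventions and also decodes the Unix timestamp shown on the slide. The variable names are illustrative and not part of the talk.

```python
from datetime import datetime, timezone

ambiguous = "12/10/2021"

# Day-first reading (common in Europe): 12 October 2021
day_first = datetime.strptime(ambiguous, "%d/%m/%Y").date()

# Month-first reading (common in the US): 10 December 2021
month_first = datetime.strptime(ambiguous, "%m/%d/%Y").date()

# The Unix timestamp from the slide, interpreted as UTC
from_epoch = datetime.fromtimestamp(1633993200, tz=timezone.utc)

print(day_first.isoformat())    # 2021-10-12
print(month_first.isoformat())  # 2021-12-10
print(from_epoch.isoformat())   # 2021-10-11T23:00:00+00:00
```

Both readings parse without error, so nothing in the data itself signals which one was intended.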

10 of 67

ISO8601

2021-10-12T00:00:00+00:00

11 of 67

ISO8601

2021-09-12T00:00:00+00:00

2021-10-12T00:00:00+00:00

2022-01-01T00:00:00+00:00

2023-03-05

2024-10-12T09:00:00+05:45
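
As a rough illustration of why a shared format helps, the values above can be read directly by Python's standard library (version 3.7 or later is assumed). This is a sketch rather than a full ISO 8601 parser; the variable names are illustrative.

```python
from datetime import date, datetime, timedelta, timezone

values = [
    "2021-09-12T00:00:00+00:00",
    "2021-10-12T00:00:00+00:00",
    "2022-01-01T00:00:00+00:00",
    "2024-10-12T09:00:00+05:45",
]

# Each string carries its own UTC offset, so no guessing is needed
parsed = [datetime.fromisoformat(v) for v in values]

# The +05:45 offset is preserved and can be normalised to UTC
last = parsed[-1]
last_in_utc = last.astimezone(timezone.utc)
print(last.utcoffset() == timedelta(hours=5, minutes=45))  # True
print(last_in_utc.isoformat())  # 2024-10-12T03:15:00+00:00

# A date-only value is also valid and parses as a plain date
date_only = date.fromisoformat("2023-03-05")
print(date_only.month)  # 3
```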

12 of 67

ISO8601

ISO = International Organization for Standardization

13 of 67

ISO8601

There are a lot of (data) standards

14 of 67

ISO8601

Standards build on other standards: it’s a system

15 of 67

ISO8601 RFC3339

“...an Internet profile of the ISO 8601 [ISO8601] standard for representation of dates and times using the Gregorian calendar.”

People follow implementations rather than specifications

16 of 67

ISO8601 RFC3339

Standards for interchange not data entry or storage

{ 2021-11-16 }
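
One way to read this point: an application can accept and store dates however suits its users, and apply the standard only at the point of exchange. The sketch below assumes a hypothetical day-first entry format and a hypothetical `event_date` field; neither comes from the talk.

```python
import json
from datetime import date, datetime


def parse_entry(text):
    """Parse a date as a user typed it, day first (assumed entry format)."""
    return datetime.strptime(text, "%d/%m/%Y").date()


def _encode(value):
    """Write dates as RFC 3339 full-date strings when serialising."""
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f"Cannot serialise {type(value).__name__}")


# Internally the record holds a native date object, not a string
record = {"event_date": parse_entry("16/11/2021")}

# Only the interchange payload uses the standard representation
payload = json.dumps(record, default=_encode)
print(payload)  # {"event_date": "2021-11-16"}
```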

17 of 67

ISO8601 RFC3339

  • The event took place in early November
  • 2021-11 fails validation.
  • To enter data, user adds arbitrary day: 2021-11-01
  • Data from multiple sources can be analysed (all dates standardised) but the data might mislead: “The 1st of the Month is the best day to run events.”

Standards enable and constrain
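
The scenario above can be sketched in a few lines of Python. The validator is a simplified stand-in for a full-date check, and the sample values are invented purely to show the effect of padding.

```python
import re
from collections import Counter
from datetime import datetime


def is_full_date(value):
    """Return True only for a complete YYYY-MM-DD calendar date."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


print(is_full_date("2021-11"))     # False
print(is_full_date("2021-11-01"))  # True

# Invented sample: three events where only the month was known
reported = ["2021-11", "2021-11-17", "2021-12", "2021-10-05", "2021-09"]

# The workaround: pad month-only values with an arbitrary first day
padded = [v if is_full_date(v) else v + "-01" for v in reported]

# Every value now validates, but the day-of-month counts are skewed
days = Counter(datetime.strptime(v, "%Y-%m-%d").day for v in padded)
print(days.most_common(1))  # [(1, 3)]
```

The padded data passes validation, yet the apparent peak on the 1st is produced by the workaround rather than by the events themselves.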

18 of 67

19 of 67

20 of 67

Essentials of a data standard schema

  • Fieldnames (or paths)
  • Definitions
  • Validation rules

21 of 67

site_code

commodity_name

commodity_code

quantity

unit

22 of 67

  • site_code - (string) a unique identifier for the site
  • commodity_name - (string) the given name for a primary agricultural product that can be bought and sold. Note: this is used for labelling only, and might be overridden by values taken from the commodity_code reference list.
  • commodity_code - (string; enum) a value from the approved codelist that uniquely identifies the specific commodity.
  • quantity - (number) the number of units of the commodity
  • unit - (string; enum) the unit the quantity is measured in, from the list kg for kilograms or tonne for metric tonne. If quantities were collected in an unlisted unit, they should be converted before representation.
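
To show how field names, definitions and validation rules fit together, here is a minimal Python sketch of these five fields as a checkable schema. It assumes every field is required, which the definitions do not state. The commodity codelist is a placeholder holding only the sample code this deck shows for apples, and the site code is invented.

```python
# Placeholder codelist: a real one would come from the standard's reference data
COMMODITY_CODES = {"080801"}

SCHEMA = {
    "site_code": {"type": str},
    "commodity_name": {"type": str},
    "commodity_code": {"type": str, "enum": COMMODITY_CODES},
    "quantity": {"type": (int, float)},
    "unit": {"type": str, "enum": {"kg", "tonne"}},
}


def validate(record):
    """Return a list of error messages; an empty list means the record passes."""
    errors = []
    for field, rule in SCHEMA.items():
        if field not in record:
            errors.append(f"{field}: missing")
            continue
        value = record[field]
        # bool is a subclass of int in Python, so reject it explicitly
        if isinstance(value, bool) or not isinstance(value, rule["type"]):
            errors.append(f"{field}: wrong type")
            continue
        if "enum" in rule and value not in rule["enum"]:
            errors.append(f"{field}: {value!r} is not in the codelist")
    return errors


good = {
    "site_code": "S-001",  # invented identifier
    "commodity_name": "Apples",
    "commodity_code": "080801",
    "quantity": 250,
    "unit": "kg",
}
bad = dict(good, quantity="250", unit="lb")

print(validate(good))  # []
print(validate(bad))   # ['quantity: wrong type', "unit: 'lb' is not in the codelist"]
```

The second record fails because the unit definition requires quantities in unlisted units to be converted before they are represented.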

site_code

commodity_name

commodity_code

quantity

unit

23 of 67

24 of 67

Apples =

080801

c_541

25 of 67

For example: The Open Apparel Registry has developed unique production site identifiers by combining different existing datasets

26 of 67

site_code

commodity_name

commodity_code

quantity

unit

27 of 67

Data standards are technical specifications of how to exchange data

28 of 67

Designing standards is a technical task

29 of 67

Designing standards goes beyond the technical tasks

30 of 67

Creator

Intermediary

User

31 of 67

Creator

Intermediary

User

Creator

Creator

User

User

32 of 67

Intermediary

User

User

User

Creator

Creator

Creator

33 of 67

Creator

Intermediary

User

Creator

Creator

User

User

34 of 67

Creator

Intermediary

User

Creator

Creator

User

User

35 of 67

Creator

Intermediary

User

Creator

Creator

User

User

$

36 of 67

Creator

Intermediary

User

37 of 67

Data standards could be a powerful tool for the projects you are working on

38 of 67

Data ecosystems

39 of 67

Standards can support decentralisation and innovation

40 of 67

Approaches open to decentralisation can support greater generativity, freedom and resilience

41 of 67

Let many flowers bloom?

Publish once. Use anywhere.

42 of 67

43 of 67

44 of 67

Support standards over apps

45 of 67

46 of 67

Support build

Adopt, adapt, engage, shape

47 of 67

48 of 67

Infrastructures

49 of 67

50 of 67

Some components of data infrastructure

  • Schema & documentation�
  • Validation tools�
  • Reference implementations & code�
  • Reference data�
  • Data registries�
  • Aggregators and APIs

51 of 67

52 of 67

See https://www.stateofopendata.od4d.net/ infrastructure chapter for references

53 of 67

Institutions

54 of 67

Standards adoption requires trust

Ownership, stewardship and institutions matter

55 of 67

56 of 67

57 of 67

58 of 67

Industry (data) standards are set by those who turn up.

59 of 67

Reflections & challenges

60 of 67

61 of 67

What causes data standards to fail?

(1) Underinvestment

62 of 67

What causes data standards to fail?

(2) Stopping short of version 2

63 of 67

What causes data standards to fail?

(3) Tailoring to the dominant use case

and

(4) Trying to meet all use cases

64 of 67

What causes data standards to fail?

(5) Treating standards as a technical problem

and

(6) Neglecting the technical details

65 of 67

12-10-2021

2021-10-12

Oct 12th 2021

der 12. Oktober 21

1633993200

October 12th 2021

Seshhanbeh 1400 Mehr 20

Mon Oct 11 2021 23:00:00 GMT+0000

12/10/2021

US Date: 10th December

European: 12th October

66 of 67

Closing remarks

67 of 67

Thank you!

E-mail: tim@practicalparticipation.co.uk

Twitter: @timdavies