1 of 48

better use of data

Jerry Fishenden

2016/17

2 of 48

why data matters – some examples

Improving the accuracy and quality of decisions

Improving the relevance, quality and timeliness of services

Providing an open resource for innovation

Data for informing legislation, regulation, policy and decision-making – socio-economic and related data, visualisation, maps, analytics, etc.

Personal, often sensitive, data relating to citizen biographics, circumstances, needs, status

Public data that are open, or available to license, to enable innovation and economic development – e.g. real time transport data, detailed data about public resourcing & expenditure, etc.

3 of 48

first principles

Not all data is equal

Train timetables

Personal medical record

Local Authority annual accounts

(partially)

Some is open and public ….

… and some private, personal and sensitive

At risk children

4 of 48

the challenge

How do you make effective, appropriate use of data when it’s dispersed in many different places and has different levels of sensitivity and protection?

Organisation 1 data

Organisation 2 data

Organisation 3 data

5 of 48

the reality

Keeping data close to where it’s needed and separate from other data can be both efficient, and good security and privacy practice

Health records

Financial records

Education records

6 of 48

the reality

But data can also be fragmented because of poor organisational and service design, making it difficult to know whether someone is entitled to a service

Financial records

(HMRC)

Health records

(NHS)

Social care records

(DWP and local authorities)

Is Jo eligible for Meals on Wheels?

?

?

?

7 of 48

Jo’s story

Better use of data can help improve Jo’s services without invading Jo’s privacy, or risking fraud and security breaches

I want my services to be delivered to me quickly and easily. But I don’t want lots of people knowing all about my personal affairs.

8 of 48

“data sharing”

a recap

9 of 48

the paper-age legacy

Before computer systems, to ensure that more than one person or team could access data, it was duplicated and physically shared with others – what has become known as “data sharing”.

10 of 48

traditional approaches to data sharing

Hi, tell me what you know about John Smith

John Smith is 43. He lives at 123 The Street. He earned £14,300 last year.

11 of 48

traditional approaches to data sharing

John Smith is 43.

He lives at 123 The Street.

He earned £14,300 last year.

Everyone gets their own copy

12 of 48

traditional approaches to data sharing

John Smith is 43

He lives at 123 The Street

He earned £14,300 last year

Everyone gets together to share what they know

13 of 48

computer data sharing – traditional but digital

John Smith is 43.

He lives at 123 The Street.

He earned £14,300 last year.

Data shared / copied – partially or in full – via e.g. File Transfer Protocol (FTP) or SSH File Transfer Protocol (SFTP), or by USB stick, DVD, courier, carrier pigeon, etc.

14 of 48

problems with data

15 of 48

The Deputy National Security Advisor, Intelligence Security and Resilience said in evidence:

“…we don’t have sufficiently specific guidance from the Information Commissioner’s Office on what should and should not be reported……There is some exact language…in the terminology used by the Information Commissioner, and you will see it is very broad. The latest guidance states, “If a large number of people are affected or there are very serious consequences, you should inform the ICO.” That is open to interpretation if one delivers services.”

He went on to say:

“…we are going with the Information Commissioner to work for clearer standards so that it is more straightforward for Departments, and frankly to improve conduct.”

The evidence session also revealed that no mandatory training of staff about how to handle data breaches currently takes place across government. There is no mandatory requirement to report a breach if it occurs.

evidence to the PAC hearing on the NAO report

16 of 48

problems with data – nothing new

17 of 48

problems with data

Using paper-age “data sharing” in the age of digital information is part of the reason for massive data breaches at pace and scale

– a better approach is needed

18 of 48

known issues with “data sharing”

John Smith is 44.

He lives at 321 Park Heights.

He earned £22,100 last year.

Oops. When the original data source is updated, copies become out of date.

Data rusts.
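
A minimal sketch of why copies rust, assuming a hypothetical in-memory `source` record and reusing the illustrative John Smith values from the slides. A copy taken at one point in time keeps the old values, whereas a lookup made at the point of need reflects the update.

```python
import copy

# Hypothetical record held by the original data source (illustrative values).
source = {"name": "John Smith", "age": 43, "address": "123 The Street"}

# "Data sharing": another organisation takes its own copy.
shared_copy = copy.deepcopy(source)

# The original source is later updated; the copy is not.
source.update(age=44, address="321 Park Heights")


def lookup(field):
    """Ask the source at the point of need instead of keeping a copy."""
    return source[field]
```

Here `shared_copy["address"]` still holds the old address, while `lookup("address")` returns the current one.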

19 of 48

consent

Joan Smith is 43.

She lives at 123 The Street.

She earned £14,300 last year.

The citizen gave her data to one organisation for one specific purpose

Her data cannot be used for other purposes without her informed consent

20 of 48

consent “get around” – data sharing gateways

Joan Smith is 43.

She lives at 123 The Street.

She earned £14,300 last year.

The citizen gave her data to one organisation for one specific purpose

But data sharing gateways enable one or more organisations to share data subject to a specific agreement

“... provisions which authorise the use and sharing of information other than for the purpose for which it was originally obtained, although subject to restrictions and conditions…”

… but they can also undermine trust

21 of 48

… but 1-2-1 gateways don’t scale well …

22 of 48

known issues

Data handed to third parties is no longer under the original data owner’s control

John Smith is 43.

He lives at 123 The Street.

He earned £14,300 last year.

Data can be compromised – either through accident or intent – and without the original data owner knowing

23 of 48

known issues

Data handed to third parties – with the best of intent – can compromise the safety of the data subject. This can be a matter of life and death.

Joe is an at risk child. He is at a secure address. The details are …..

Joe

24 of 48

known issues

Personal data is often used for online security and authentication checks, such as those for online banking

Mother’s maiden name

Date of Birth

Place of Birth

Memorable date

Sharing bulk data – civil registration data such as births, marriages, etc. – could compromise existing security assurance processes, automating and facilitating fraud

Parents’ names

Children’s names

Date of Birth

Place of Birth

etc.

25 of 48

first principles

What is “data sharing” trying to achieve? Perhaps we could better describe it as something like ….

“Ensuring timely access to accurate data in order to make an efficient and well-informed decision with a high quality outcome”

26 of 48

first principles

We need to find better, more secure, means of achieving this outcome – which means better solutions than paper-age “data sharing”

“Ensuring timely access to accurate data in order to make an efficient and well-informed decision with a high quality outcome”

27 of 48

digital systems can enable better use of data – whilst also ensuring more secure access and protection

We have things such as

Zero knowledge proof

APIs

Encryption

Authentication and authorisation

Attribute / claim confirmation

28 of 48

zero knowledge proof

a method by which one party can prove to another party that a given statement is true, without conveying any information apart from the fact that the statement is indeed true

I am entitled to discounted energy pricing

I am over 21

I am legally resident in the UK

Does not release e.g. full date of birth

Does not release e.g. details of financial circumstances

Does not release e.g. personal details such as passport information

A technique in existence since the mid-1980s
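
To make the idea concrete, here is a toy sketch of the Schnorr identification protocol, a classic (honest-verifier) zero-knowledge proof in which a prover shows they know a secret number without revealing it. It is not taken from the slides and is not how a claim such as "I am over 21" would be proven in practice; that would need a credential scheme built on top of primitives like this. The group parameters `P`, `Q` and `G` are deliberately tiny assumptions chosen for readability and offer no real security.

```python
import secrets

# Toy group: G has prime order Q modulo P. Far too small for real use.
P, Q, G = 23, 11, 2


def keygen():
    """Prover picks a secret x and publishes y = G^x mod P."""
    x = secrets.randbelow(Q - 1) + 1  # secret in 1..Q-1
    return x, pow(G, x, P)


def commit():
    """Prover picks a fresh random nonce r and sends t = G^r mod P."""
    r = secrets.randbelow(Q)
    return r, pow(G, r, P)


def respond(secret, nonce, challenge):
    """Prover answers the verifier's challenge: s = r + c*x mod Q."""
    return (nonce + challenge * secret) % Q


def verify(public, commitment, challenge, response):
    """Verifier checks G^s == t * y^c mod P without ever seeing x."""
    return pow(G, response, P) == (commitment * pow(public, challenge, P)) % P


def run_round(secret, public):
    """One full round: commit, random challenge, response, verification."""
    nonce, commitment = commit()
    challenge = secrets.randbelow(Q)
    response = respond(secret, nonce, challenge)
    return verify(public, commitment, challenge, response)
```

With parameters this small a cheating prover could guess the challenge about one time in eleven, so a real deployment would use a large standardised group, or repeat the round many times.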

29 of 48

APIs ...

… an abbreviation for Application Programming Interface

– an interface that lets one computer talk to another computer

Example: using APIs would enable a citizen to have a single view of their financial interactions with central government by bringing together their data from departments such as DWP and HMRC

APIs

Your account

Modern Web-based APIs have been in use since around 2000
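
The "single view" example could be sketched as below. The functions `dwp_api` and `hmrc_api` are hypothetical stand-ins, not real departmental services, and the amounts are made up; real APIs would be called over a network and would sit behind authentication and authorisation.

```python
from dataclasses import dataclass


@dataclass
class Transaction:
    department: str
    description: str
    amount_gbp: float  # positive = paid to the citizen, negative = paid by the citizen


# Hypothetical stand-ins for departmental APIs (illustrative data only).
def dwp_api(citizen_id: str) -> list[Transaction]:
    return [Transaction("DWP", "Benefit payment", 250.00)]


def hmrc_api(citizen_id: str) -> list[Transaction]:
    return [Transaction("HMRC", "Income tax", -120.00)]


def your_account(citizen_id: str) -> dict:
    """Assemble a single view on demand; nothing is copied into a new store."""
    transactions = dwp_api(citizen_id) + hmrc_api(citizen_id)
    return {
        "citizen_id": citizen_id,
        "transactions": transactions,
        "net_gbp": sum(t.amount_gbp for t in transactions),
    }
```

The design point is that each department remains the source of its own data, and the combined view is built only when the citizen asks for it.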

30 of 48

encryption

a cryptographic means of protecting data at rest and in motion so that it can be accessed and used only by authorised people or systems

Data at rest – unencrypted. Anyone with access to the data can read and use it.

Data at rest – encrypted. Only authenticated and authorised users and systems can access the data to read and use it.

Data in motion – unencrypted. Anyone with access to the communications channel or medium (such as a network or USB stick) can access and use the data.

Data in motion – encrypted. No-one without the decryption key can access and use the data.
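
As an illustration of "no-one without the decryption key can access and use the data", here is a one-time pad written with only the Python standard library. This scheme is an assumption chosen purely for brevity: it offers no integrity protection and the key must be as long as the message and never reused. Real systems would use vetted authenticated encryption from a maintained cryptographic library instead.

```python
import secrets


def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Return (key, ciphertext). The key is random and as long as the message."""
    key = secrets.token_bytes(len(plaintext))
    ciphertext = bytes(p ^ k for p, k in zip(plaintext, key))
    return key, ciphertext


def decrypt(key: bytes, ciphertext: bytes) -> bytes:
    """XOR with the same key recovers the original bytes."""
    return bytes(c ^ k for c, k in zip(ciphertext, key))
```

Someone who intercepts only the ciphertext sees random-looking bytes; only a holder of the key can recover the message.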

31 of 48

authentication and authorisation

authentication

a means of ensuring that people or systems are who they claim to be

authorisation

ensuring that an authenticated person or system only accesses the data or processes that they are authorised to access

We have successfully authenticated that this is Joe …

… and that Joe is authorised to see this data …

… but Joe is not authorised to see this data
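
The two steps can be kept separate in code, as in this minimal sketch. The user store, the password and the resource names are hypothetical; a production system would rely on an established identity service rather than a dictionary in memory.

```python
import hashlib
import hmac
import secrets


def hash_password(password: str, salt: bytes) -> bytes:
    """Derive a slow, salted hash so the password itself is never stored."""
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)


# Hypothetical user store and permissions (illustrative values only).
_salt = secrets.token_bytes(16)
USERS = {"joe": (_salt, hash_password("correct horse", _salt))}
PERMISSIONS = {"joe": {"own_tax_record"}}


def authenticate(user: str, password: str) -> bool:
    """Authentication: is this person who they claim to be?"""
    if user not in USERS:
        return False
    salt, stored = USERS[user]
    return hmac.compare_digest(stored, hash_password(password, salt))


def authorise(user: str, resource: str) -> bool:
    """Authorisation: may this authenticated person see this data?"""
    return resource in PERMISSIONS.get(user, set())
```

Joe can pass `authenticate` and still be refused by `authorise` for any record outside his permissions, which is the distinction the slide draws.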

32 of 48

attribute / claim confirmation

a means of confirming something without unnecessary disclosure or sharing of personal information

Current “data sharing” practice – full records copied from one system or organisation to another

Give me a copy of Joan’s address so I can check where she lives

ORG2

ORG1

Data request

Shared data response

Joan

123 The Street

Anytown

B1 1B

33 of 48

attribute / claim confirmation

a means of confirming something without unnecessary disclosure or sharing of personal data

Joan

Local authority

DVLA

Is Joan a resident of this local authority?

Illustrative only … not necessarily legally compliant

YES / NO

Does Joan own a car registered at this address?

YES / NO

Resident’s parking permit automatically issued or declined

Data driven service

Data is not shared – details are confirmed without disclosure
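
A minimal sketch of the parking permit flow, assuming hypothetical in-memory records for each organisation. Like the slide, it is illustrative only and not necessarily legally compliant. Joan supplies her own name and address when applying; each organisation answers only YES or NO and never releases its records.

```python
# Hypothetical records, held separately by each organisation.
_LOCAL_AUTHORITY_RESIDENTS = {"Joan": "123 The Street, Anytown"}
_DVLA_REGISTERED_KEEPERS = {"Joan": "123 The Street, Anytown"}


def local_authority_confirms_resident(name: str, address: str) -> bool:
    """Is this person a resident of this local authority? YES / NO."""
    return _LOCAL_AUTHORITY_RESIDENTS.get(name) == address


def dvla_confirms_car(name: str, address: str) -> bool:
    """Does this person own a car registered at this address? YES / NO."""
    return _DVLA_REGISTERED_KEEPERS.get(name) == address


def parking_permit(name: str, address: str) -> str:
    """Issue or decline automatically, using only the two confirmations."""
    if local_authority_confirms_resident(name, address) and dvla_confirms_car(name, address):
        return "issued"
    return "declined"
```

The permit service learns two booleans and nothing else.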

34 of 48

the Digital Economy Bill

Part 5 – data sharing

35 of 48

DE Bill intent

to make better use of data – accurate, timely data will help improve our public services

36 of 48

DE Bill major proposals

  • to enable Ministers to authorise the “sharing” of a citizen’s personal data with public and private bodies without the citizen’s knowledge or consent – for purposes other than those for which it was originally provided
  • to disclose citizens’ personal details to commercial organisations, such as energy suppliers
  • to “share” bulk data around, including registers of births, marriages and deaths

37 of 48

DE Bill issues

  • Rooted in paper-age notions of “data sharing”
  • Broad-ranging – would breach trust
    • Citizens are obliged to provide data to government in order to receive services. They cannot refuse and go elsewhere – but government will now share that data without consent and for purposes other than those for which it was provided
  • Codes of Practice poor – need to apply good practice computer security / privacy techniques
  • Appears to conflict with the General Data Protection Regulation (GDPR)
  • “Law of unintended consequences”
    • some of the data could compromise security elsewhere: e.g. date of birth, mother’s name etc. bulk shared means some security methods, such as for online and phone authentication for e.g. banking, could be compromised
    • overlooks the essential need to protect edge cases – unclear how witness protection, undercover law enforcement and others will be protected from compromise by data sharing
  • No evidence of any risk modelling of the proposals

38 of 48

DE Bill issues

39 of 48

DE Bill issues – conflicts with GDPR

GDPR requirement 1: Data must not be used to monitor the behaviour of people in a way which could be seen as profiling.

Part 5: The Bill wants to share data in order to “flag identified persons” entitled to receive assistance. This appears to be profiling, in conflict with the GDPR. There is also no mention of the emphasis that should be given to data minimisation.

GDPR requirement 2: “Data held by public authorities should only be disclosed when a written, reasoned and occasional request has been made and should not be shared as a filing system in a way that could lead to the interconnection of filing systems.”

Part 5: Its general purpose does not appear well aligned to the GDPR. It seems to default to allowing organisations to share data without consent, with the organisation alone, not the individual, making the decision. In effect it proposes an approach analogous to a public-sector-wide file sharing system (the “data sharing” it proposes), apparently in conflict with the GDPR.

GDPR requirement 3: Pseudonymised data should be considered identifiable information. Also, Recital 26 states that “The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.”

Part 5: Recital 26 raises the question of how well personal data can truly be anonymised – there is a body of research / evidence about the problems associated with truly anonymising data and hence the potential for re-identification. Such difficulties suggest that in practice the GDPR will apply where data can be, or proves to be, re-identifiable, even if an organisation intends or believes the data to have been anonymised. De-identified data is not necessarily anonymised data: where are the de-identification regulations / frameworks? If any exist, they do not seem to be referenced in Part 5.

GDPR requirement 4: People should be aware of the risks, rules, safeguards and rights in relation to the processing of their personal data.

Part 5: If data is shared beyond the organisation or individual to whom it was originally provided, and without the citizen’s consent or knowledge, it is unclear how citizens will be updated on the additional risks inherent in opening up their data to additional organisations and people.

40 of 48

DE Bill issues – conflicts with GDPR

GDPR requirement 5: The exact purpose for the need of the data should be explained at the point the data is requested.

Part 5: Appears to cut across the GDPR, since it proposes to “data share” or “disclose” data, meaning that the data is, in such circumstances, no longer being used for the exact purpose for which it was originally requested and provided. “Data sharing” implies uses of the data other than those for which it was originally supplied.

GDPR requirement 6: Processing should only happen if there is no alternative way.

Part 5: Other ways already exist to achieve the objective of making better use of data without copying it around more organisations.

GDPR requirement 7: Data is only lawfully processed if consent has been given by the individual. The GDPR also gives data subjects the right to withdraw consent at any time and “it shall be as easy to withdraw consent as to give it.” Controllers must inform data subjects of the right to withdraw before consent is given. Once consent is withdrawn, data subjects have the right to have their personal data erased and no longer used for processing.

Part 5: In the context of “data sharing” (undefined), the consent issue becomes problematic: for example, how will consent be withdrawn if data have been widely “shared” and dispersed across multiple organisations over whom the original data controller has no jurisdiction?

GDPR requirement 8: The data controller should be able to prove that consent has been given (an automatically completed tick box is not considered consent).

Part 5: Appears to be proposing a system of data sharing in which consent is not explicitly provided, but is determined by the decisions of “specified persons” rather than the citizen. This appears to place it in direct conflict with the GDPR.

41 of 48

the Bill doesn’t solve significant issues such as …

… this …

… or this

42 of 48

it mainly seems to be a (clumsy) way of solving this ...

Replacing 1-2-1 “data sharing” gateways with …

… widespread “data sharing”

43 of 48

DE Bill Part 5 improvements?

Largely as in the amendments proposed by the House of Lords:

  • the public authorities to be granted the powers should be listed on the face of the Bill
  • Ministers should not have power to add any public authority, or description of authority, but only those authorities which they can show, by reference to particular criteria specified in the Bill, have difficulty in recovering debt
  • the power to prescribe a person who provides services to a public authority as a “specified person” should be removed from the Bill (and potentially more focused – with data only being “shared” when vital not simply to improve “wellbeing”)
  • the Codes of Practice will be laid before Parliament and only implemented when agreed by Parliament (recognising the importance of the Codes as legally binding documents)

The Bill also needs to be brought into full compliance with the GDPR

… but it’s the Codes of Practice that need significant improvements – and to be mandated

44 of 48

better use of data – some potential improvements

  • Map the data landscape – what is where, the quality, the extent of duplication, access controls, etc. – and develop rationalisation / transition plans to create an improved data architecture
  • Give citizens control of their own personal data, and ensure they are an active player in providing consent for making smarter use of data (do it with them, not to them)
  • Ensure good computer security and privacy best practice through the codes of practice in areas such as
    • Zero knowledge proof
    • APIs
    • Encryption
    • Authentication and authorisation [trust framework across private / public sectors required]
    • Attribute / Claim confirmation
  • Ensure protective monitoring, audit, etc., including (as per Estonia) the ability of citizens to see which officials have accessed their data [legal exemptions would apply]
  • Adhere to the principle of data minimisation and the GDPR
  • Ensure routine, standardised breach reporting
  • Model how this will impact sensitive edge cases – at risk individuals, undercover law enforcement, etc. to ensure the design does not cause unintended data breaches

45 of 48

DE Bill – how it could work (simplified)

GP

Has a valid medical condition? YES / NO

1. Jo authorises disclosure of minimal information for meals on wheels

Local Authority

DWP

Simplified … not necessarily legally compliant

Registered disabled? YES / NO

Is a council resident? YES / NO

2. If all checks are passed, Jo receives meals on wheels

Not “data sharing” – but better use of data with active citizen consent

meals on wheels
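
The two numbered steps on this slide might look like this in code. The sketch is simplified in the same way as the slide and is not necessarily legally compliant; the three checks are hypothetical stand-ins for questions put to the GP, DWP and the Local Authority, each of which answers only YES or NO.

```python
from typing import Callable

# Hypothetical YES / NO checks, one per organisation (illustrative logic only).
CHECKS: dict[str, Callable[[str], bool]] = {
    "GP: has a valid medical condition": lambda person: person == "Jo",
    "DWP: registered disabled": lambda person: person == "Jo",
    "Local Authority: is a council resident": lambda person: person == "Jo",
}


def meals_on_wheels(person: str, consent_given: bool) -> str:
    # Step 1: nothing is queried unless the citizen has authorised disclosure.
    if not consent_given:
        return "no consent"
    # Step 2: the service is provided only if every organisation answers YES.
    if all(check(person) for check in CHECKS.values()):
        return "eligible"
    return "not eligible"
```

The service never receives a medical record, a benefits file or an address; it receives three answers, and only after Jo has agreed.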

46 of 48

biggest issues

DE Bill Part 5 and related Codes of Practice / GDPR

Capabilities (Whitehall, supply chain)

No systematic mapping of data / existing landscape in government

No vision of where headed

No data strategy

No API strategy

Lack of a viable, trusted identity framework

47 of 48

acknowledgements

Icons / images from Freepik. Includes icons by Pixel perfect and madebyOliver

Other icons from the OSA Icon Library

48 of 48

“better use of data”

This work is licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License

© Jerry Fishenden, 2016/17