Policy Briefing Note: National Data Strategy


Understanding the social impact of the National Data Strategy

Draft, v. 0.1

Rachel Coldicutt

22 October 2020

hello@careful.industries

CONTENTS

1. Purpose of this note

2. A very brief summary

3. Overview

4. Concerns

  1. For Civil Society

  2. For Regulators

  3. For the Civil Service

  4. Additionally

5. Draft policy recommendations

Appx 1: Dependencies and related frameworks

Appx 2: Biography

1. Purpose of this note

This note attempts to make it easier for non-technology experts to respond to Q3 of the National Data Strategy consultation, which is open until 2 December 2020.

The National Data Strategy is about decision making. Data drives decisions, and the Strategy sets out a range of measures and proposals for how data about people, things and systems could and should be used in the UK. The way decisions are made in a democracy is of interest to everyone.

Q3 of the consultation asks:

 “please provide any comments about the potential impact the proposals outlined in this consultation may have on individuals with a protected characteristic under the Equality Act 2010?”

People with protected characteristics are often regarded as “edge cases” in big data sets, and are consequently penalised because information about them is missing, incomplete, or unduly reflective of other structural biases. Q3 is also important because - without good governance - algorithmic decisions can deepen and entrench bias and social injustice.

The consultation questions are currently being considered by technology experts, privacy specialists, and data scientists, but the strategy is of interest to everyone who champions the rights of marginalised communities. To answer Q3 of the consultation, it is important to understand the impact of the Strategy in aggregate, which this briefing note attempts to do.

Draft recommendations are given at the beginning and end of this document as starting points for addressing the concerns raised here.

If you would like to develop these recommendations further, or to contribute to or co-sign a co-ordinated consultation response, please email hello@careful.industries by 15 November 2020. Otherwise, please feel free to use this note as an input to your own response.

This note does not consider the privacy, security or data adequacy aspects of the proposals.

2. A very brief summary

Data is not just a technology issue. It drives government decisions, which affect us all. And we all deserve to be counted, accurately represented, and treated with dignity.

The National Data Strategy is open for consultation until 2 December 2020, and the Department for Digital, Culture, Media and Sport has asked for feedback on how the strategy will affect people with protected characteristics. In UK law, the protected characteristics are age, disability, gender reassignment, marriage and civil partnership, pregnancy and maternity, race, religion or belief, sex, and sexual orientation.

The short answer to this question is that the strategy risks centralising power and entrenching many structural biases. It is a complex document, but its implementation will have broad economic and social implications; as such it deserves a much wider audience than it is likely to get.

This document offers some context on the Strategy in the section Overview; it attempts to summarise and explain these implications for non-technical audiences in the section Concerns; and offers four draft policy recommendations in the final section.

The draft recommendations are:

1. Appoint specialist Data Commissioners to champion minoritised communities

2. Adopt a clear, public framework for government data

3. Prioritise maintenance and repair alongside innovation

4. Recognise Data Ethics in the Government Digital Design and Technology Profession Capability Framework 


3. Overview

Summary 

The National Data Strategy is about decision making. Data drives decisions, and the Strategy sets out a range of measures and proposals for how data about people, things and systems could and should be used in the UK.

As such, it is also about power.  

The National Data Strategy’s recommendations cut across business, government, the public sector and charities; it covers everything from energy switching to cutting-edge health research, from roads and bridges to working with victims of domestic violence and preventing suicide ideation. There are sections on identifying underground assets and on working with Troubled Families. This is not a technical piece of strategy but a sociotechnical one, and it will have broad social and economic impacts.

However, the Strategy is a complicated document, and the impacts do not leap easily from the page. It offers no holistic, easy-to-understand vision of what a data-driven Britain will look like; instead it outlines technocratic and administrative methods and offers case studies of previous work. There is no definition of what “good” will look like in practical, understandable terms.

One of the most critical omissions is that the Strategy does not sufficiently differentiate between different kinds of data. It does not explain how data about roads and bridges differs from data about young people considering suicide, or set out sufficient mechanisms to involve communities in making essential decisions about their own lives. It sets out a world in which what is measured matters, without explaining who will set the targets that are measured against.

This briefing note does not set out to be alarmist, but it does attempt to highlight the social impact of some of the policy points in the Strategy. Much daily good is derived from the use of data, but not all data is the same, and it is vital that appropriate safeguards are put in place to prevent democratic evidence-gathering and decision-making from being obscured as a merely technical exercise. This is about power, and it deserves scrutiny.

Some Context

Data is neither a perfect nor a neutral resource, yet it powers mundane and extraordinary things for all of us, all of the time. It is an invisible part of everyday life, as various as the world around us and the people in it.

Whether or not you are digitally included, information about you is probably being used right now to make a decision. Everything from traffic calming to credit scores, from groundbreaking health research to ad targeting, from sending the right amount of energy to the grid to the next recommended show on Netflix benefits from up-to-date and accurate information.

Tackling all aspects of data for the UK in a single strategy is bound to lead to anomalies and gaps. Privacy and security may be the most widely discussed implications of data use, but increasingly data-driven and predictive decisions are shaping the lives that many people lead and the opportunities that are available to us. This is an emerging field in which governance is not yet adequate, and concepts of safety, effectiveness and protecting human dignity jostle with the right of technologists to innovate.

There is a whole industry growing up to discuss and debate the ethical uses of data-driven decisions, and the people with the loudest voices often represent the software companies and consultancy firms that profit from the implementation of these technologies. But the people and communities who are discriminated against by these decisions often only find out after the fact, when they are seeking explanation or repair. This is why Q3 of the National Data Strategy consultation is so important.

The National Audit Office report “Challenges in using data across government” discusses how the quality and availability of data was a factor in both the handling of the Windrush cases and Carer’s Allowance. More recently in the UK, the handling of both Coronavirus test data and death data has shown that the seemingly simple act of counting things does not always add up to the same version of events, while the analysis and scoring of data for A level results highlighted the ways that assumptions and inequalities can be both hidden and further entrenched by an over-reliance on data. Good data is an equality issue; good decisions are the foundation of a functioning democracy.

What is in the National Data Strategy?

The Strategy primarily sets out a range of methods for increasing data sharing in the UK. It describes a regime of loose control and low compliance for business, and a culture of centralised management within the civil service. Although it appears to be a piece of technology strategy, its many granular recommendations have significant social and economic implications, particularly for people in minoritised groups, but these are not spelt out clearly in the paper.

The paper mainly focuses on a variety of methods and mechanisms to enable data sharing across and between government, the public sector, the private sector, and charities. If a theme runs through it, it is that data is great, and more data will make the UK greater.

The stated aim of the paper is to determine “how best to unlock the power of data for the UK” through an “unashamedly pro-tech approach”, but it offers no analysis of the overlapping impacts of so much simultaneous, cross-sector change. There is no cost-benefit analysis, and the economic evidence is tentative. As the Bennett Institute/ODI paper “The Value of Data” points out, “Income-based valuations [of data] have several limitations.”

But more than that, the National Data Strategy is a means of centralising power. There is much focus throughout on better measurement and better evidence, but no detail of what will be measured, who will define the targets, or who will be the beneficiaries. Data is expressed in the strategy as a tool for measurement and control, not as a resource to learn from and reflect on.

The Strategy also sets out a global ambition “to be a data champion across the world”. There is recognition of the fact that data is a tool of geopolitical influence, and that the values of Europe, Russia, China and Silicon Valley do not always coincide. And while the UK is a little late to the digital party, there is a nod towards our history of global standards setting: “In the global arena, technical standards are increasingly expressions of ethical and societal values, as well as industry best practice”. For the purposes of international data diplomacy, the strategy expresses UK values as “openness, transparency and innovation”.

These ethical and societal values do not form such a central part of the domestic strategy, which is more managerial, and centred on low-friction innovation and data sharing. The “wider societal benefit of data” is frequently mentioned but not defined, beyond “better, cheaper” public services. And although it is not definitively expressed, it seems reasonable to infer that economic growth is seen as the dominant social good:

“we anticipate that in certain circumstances increasing data availability across the wider economy and society has the potential to support greater innovation and drive economic growth. This would ensure that the benefits of data are realised by the maximum number of people in society and further aid scientific research.” (S.6.1)

“Data can help drive economic growth or enable a good public outcome, especially if the value data sits beyond its immediate use” (S.4.1)

It is also worth noting some ambivalence in the definition of Data Availability in S.6, which says that data “can generate maximal economic and/or societal benefit” - indicating that one may sometimes be at the expense of the other.

What is the scope?

The National Data Strategy is not simply a technical document: its recommendations are cross-cutting and will have broad social and economic impact. One significant risk is that these broad social and economic changes will be introduced through consultation only with the data community (broadly comprising the technology, privacy and statistical communities), and not with civil society as a whole. Although the consultation recognises the need to protect the rights of people with protected characteristics, the density of the strategy will limit its reach outside the specialist technology policy community.

It should also be noted that the remit of the strategy cuts across eight government departments and several non-ministerial government bodies. Its contents touch on international relations, the workings of government, civil-service reform, public-service delivery, local authorities, the effectiveness of the charity sector, and the post-Brexit data future, as well as new opportunities and protections for businesses, particularly SMEs. (A fuller list of related organisations and papers referred to is given in the section Dependencies, related bodies, and frameworks.)


4. Concerns  

This section sets out some of the concerning social impacts of the National Data Strategy.

1. For Civil Society

a. “Fairness” and data

b. Representation and bias in government data

c. Multi-agency data sharing

d. Cross-sector data sharing

e. Individual trust and responsibility

2. For Regulators

a. Reduction of legal barriers and compliance

3. For the Civil Service

a. Civil service accountability

b. Transparency

c. Trust in government

4. Additionally

a. The strategy may not be executable


1. For Civil Society

a. “Fairness” and data

  1. One of the five pillars of the National Data Strategy is to use data to “Create a Fairer Society for All”, but the definition of “fairness” is likely to change with differing political priorities; the current definition is not explicitly stated in the Strategy.

  2. To be operational, data and data-driven systems require clear rules and specifications rather than personal interpretations of complex issues. As Diane Coyle points out in “The tensions between explainable AI and good public policy”, “the aim of many policies is often not made explicit … this is a major problem for algorithms as they need clear goals to function.”

  3. The Strategy indicates that an Integrated Data Platform is being created; this is:

“a digital collaborative environment that will support government in unlocking the potential of linked data, building up data standards, tools and approaches that enable policymakers to draw on the most up-to-date evidence and analysis to support policy development, improving public services and improving people’s lives.”

  4. This technique of delivering “up-to-date evidence and analysis” is not suitable for driving decision making in every policy area. The evidence is at risk of inheriting the flaws inherent in incomplete and biased data sets, and the analysis risks amplifying algorithmic bias. But no red lines are drawn in the strategy, and no limits are given to the range or remit of the Integrated Data Platform. This is concerning.

  5. Moreover, subjective human concepts such as “fairness” need to be explicitly broken down if they are going to be translated into technical products and services; otherwise there is a risk that the worldviews and assumptions of the technologists creating those services (who are unlikely to be demographically representative of wider society[1]) will instead become embedded in the structure of the data, the algorithms, and the insight. For consumer technology this is often inconvenient; for the delivery of public services, this could be a disaster.

  6. Along with missing, biased and unrepresentative data, this can lead to technologies having biased and discriminatory outcomes. One recent example is a BBC investigation showing that the Home Office is twice as likely to reject the passport photographs of women with darker skin as those of men with light skin.[2] This is likely the result of a biased algorithm working with an unrepresentative data set.

  7. While many frameworks and ethical standards are referred to in the Strategy, there is no detail on how data-driven “fairness” will be tested and held accountable in the long run. Likewise, there is no mention of involving wider civil society or consulting with minoritised groups on the different outcomes that data assemblages and algorithmic expressions will have for different communities and individuals.

  8. As such, it seems extremely optimistic - not to say misguided - to establish a “Fairer Society for All” as a pillar of the National Data Strategy without also establishing a meaningful system of governance and oversight.
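To illustrate why “fairness” has to be broken down explicitly before it can be built into a data-driven system, the following sketch (entirely invented toy data; nothing here is drawn from the Strategy) computes two common but mutually incompatible statistical definitions of fairness over the same set of automated decisions:

```python
# Illustrative toy example only. It shows that "fairness" has multiple,
# incompatible statistical definitions, so any system claiming to be
# "fair" must state which definition it has implemented.

def approval_rate(rows):
    """Fraction of the given decision records that were approved."""
    return sum(r["approved"] for r in rows) / len(rows)

# Invented decisions for two demographic groups, A and B, with a
# ground-truth "qualified" label for each person.
decisions = (
    [{"group": "A", "qualified": True,  "approved": True}] * 40
    + [{"group": "A", "qualified": False, "approved": False}] * 60
    + [{"group": "B", "qualified": True,  "approved": True}] * 20
    + [{"group": "B", "qualified": True,  "approved": False}] * 20
    + [{"group": "B", "qualified": False, "approved": False}] * 60
)

a = [r for r in decisions if r["group"] == "A"]
b = [r for r in decisions if r["group"] == "B"]

# Definition 1, "demographic parity": equal approval rates per group.
parity_gap = approval_rate(a) - approval_rate(b)  # 0.40 - 0.20 = 0.20

# Definition 2, "equal opportunity": equal approval rates among the qualified.
opportunity_gap = (approval_rate([r for r in a if r["qualified"]])
                   - approval_rate([r for r in b if r["qualified"]]))  # 1.0 - 0.5 = 0.5

print(f"demographic parity gap: {parity_gap:.2f}")
print(f"equal opportunity gap:  {opportunity_gap:.2f}")
```

A system could close one of these gaps while leaving the other wide open; without an explicit, published definition, a commitment to a “Fairer Society for All” cannot be tested or held to account.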

b. Representation and bias in government data

  1. There is a commitment in the strategy for data to “hold a mirror up to society” and “drive efforts to create a more inclusive, less biased society”; this is not just difficult to achieve - it may be impossible.

  2. Many public data repositories are backward-facing entities that show how society has been. They reflect structural inequalities - including sexism, racism and ableism - in both the data they contain and the data they omit.

  3. For instance, data about ethnicity is not routinely collected across all areas of government: ONS data about deaths is not regularly broken down by ethnicity, but police data about arrests for notable offences is. This makes it harder to target appropriate health and wellbeing interventions at particular ethnic groups, but - as the Joined-up data in government paper shows - it makes it possible to generate predictive data about which children are considered likely to commit a criminal offence. To quote Safiya Umoja Noble’s Algorithms of Oppression: How Search Engines Reinforce Racism, it shows “how digital decisions reinforce oppressive social relationships and enact new modes of racial profiling”. This is just one example of how gaps and asymmetries in the data collected by the UK government in turn affect people’s lives - often without any opportunity for recourse or redress. Repairing these gaps and omissions may be the work of decades.

  4. The National Data Strategy specifies the importance of technical standards but does not include any reference to standards for nationally representative data sets; there is no information on the possible process, timeline and governance required to “de-bias” data; and there is no mention of external scrutiny, engagement with minoritised communities, or any specific rights or powers for affected communities to seek redress or improve available data.

  5. Moreover, data is simply a material: it has no agency, is not yet self-cleaning, and it cannot drive a more inclusive society on its own. People are needed to do that.

  6. This is not just a risk for policy areas that target minoritised groups; it is a risk for everyone. The computing axiom of “Garbage In, Garbage Out” is relevant here: a system like the Integrated Data Platform will learn from the data it uses and stores. As a recent paper by Vinay Uday Prabhu and Abeba Birhane shows, flaws and biases in large data sets can generate significant social threats and harms. Racist inputs and outcomes will provide poor quality training data for government decision-making in other contexts.[3]
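The “Garbage In, Garbage Out” dynamic can be made concrete with a small simulation. This is a deliberately simplified, invented model - not a description of any real government system - of what happens when a system allocates attention in proportion to its own historical records: an initial skew in the records never corrects itself, even when the underlying reality is identical across groups.

```python
# Toy simulation of "Garbage In, Garbage Out" feedback. All numbers are
# invented; this sketches the dynamic, not any real system.

def simulate(true_rates, initial_records, rounds, capacity=100):
    """Each round, observation capacity is allocated in proportion to past
    records; new records depend on both the true rate and the attention paid."""
    records = dict(initial_records)
    for _ in range(rounds):
        total = sum(records.values())
        new = {area: capacity * (records[area] / total) * rate
               for area, rate in true_rates.items()}
        for area, n in new.items():
            records[area] += n
    return records

# Two areas with IDENTICAL true incident rates, but area X starts with more
# historical records (for instance, because it was observed more in the past).
records = simulate(
    true_rates={"X": 0.5, "Y": 0.5},
    initial_records={"X": 60, "Y": 40},
    rounds=20,
)

share_x = records["X"] / (records["X"] + records["Y"])
print(f"share of records pointing at X after 20 rounds: {share_x:.2f}")  # stays 0.60
```

The system never “discovers” that the two areas are the same: the biased starting point is laundered, round after round, into apparently objective evidence. Breaking that loop requires deliberate human intervention, not more data collection of the same kind.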

c. Multi-agency data sharing 

  1. Multi-agency data sharing is an important part of modern government, but no distinction is made in the Strategy between the treatment of different kinds of data - about, for instance, how sharing data about bridges and roads is different to sharing data about people.

  2. Combining multiple sources of information about a person does not result in having a more complete, objective view of that individual. While there is a commitment in the Strategy to improve “Data Foundations” (and so, over time, accrue more accurate data sets), there is no acknowledgement of the inherent gaps and limitations in such information, or that such data will never give a complete view of a person or a situation.

  3. While many different ethical frameworks are mentioned in the Strategy, there is no explicit reference to the deleterious effects that partial and incomplete data can have on decision making, and no recognition that layering incomplete data from multiple sources does not make it more accurate.

  4. Instead there is a focus on removing “real and perceived” barriers to data sharing.

  5. The strategy tends towards incremental managerialism rather than systemic insight or an understanding of trends. This involves breaking people and things down into data points that represent (some of) their constituent parts, so they can be grouped with similar people and things: roughly, oranges with oranges; bad apples with other bad apples. This can lead to service delivery by decision tree: for instance, “people who like x might do y, and can be prevented by z”.

  6. As the A-level results algorithm showed, this kind of grouping and the resulting decisions can limit social mobility and restrict life chances, while also masking wider social and economic factors.

  7. There is much administrative detail in the Strategy about the changes that need to happen within the Civil Service to achieve this. A number of case studies are given to show where open data and multi-agency data sharing have been a success in the past -- including health research, open transport data, and energy regulation -- and there is a robust case for making it easier for government agencies to co-operate. But, just as with data, not all forms of co-operation are equal.

  8. In May 2018, the then Digital Minister, Margot James, tabled an amendment in the Commons to suspend the controversial MOU between the Home Office, the Department of Health and NHS Digital, which allowed personal addresses to be shared with the immigration services. Speaking in the debate, Dr Sarah Wollaston MP said,

“medical confidentiality … lies at the heart of the trust between clinicians and their patients, and we mess with that at our peril. If people do not have that trust, they are less likely to come forward and seek the care that they need. There were many unintended consequences as a result of th[is] decision”.[4]

  9. The Troubled Families programme is given as the case study in the Strategy to show how multi-agency data sharing combined with targeting based on individual characteristics leads to “better, cheaper” public services.

  10. This programme, which runs in slightly different forms in different local authorities, “bring[s] together relevant data from local public service partners such as attendance, employment, anti-social behaviour and crime”[5] to identify families experiencing multiple difficulties; the families are then classified as “troubled”. It is worth noting that, in the latest MHCLG progress report, “health partners were particularly reluctant to share information and there is high sensitivity around health data”; this may be an example of a “real and perceived” barrier the Data Strategy is hoping to overcome.

  11. In their paper “Datafied Child Welfare Services”, Joanne Redden, Lina Dencik and Harry Warne note that targeting services “is based solely on the premise that something could happen, not with why or how”. They argue that this kind of predictive targeting ignores the broader social and economic context. Lambert and Crossley’s 2017 review of the Troubled Families programme documents how this focus leads to a reduction in universal welfare provision, part of a process in which social exclusion is seen as a “condition” of the individual or family rather than a “process” happening within society.

  12. The auxiliary paper “Joined up data in government” (which is cited in the Strategy, with the note that its recommendations will be implemented) is a data methods paper. It gives an example of how data from the Ministry of Justice and Department of Education can “increase understanding of the links between childhood characteristics, education outcomes and (re)-offending” to “assist in identifying the population that requires support through early intervention and evaluating these projects to understand whether they are effective”.

  13. This methodology has much in common with predictive policing programmes. Kristian Lum and William Isaac’s paper, “To predict and serve?”, outlines how data analytics were used by the Chicago Police Department to identify approximately 400 people who might commit violent crimes, based in part on previous arrest data. Lum and Isaac found that “it is clear that police records do not measure crime. They measure some complex interaction between criminality, policing strategy, and community–police relations.”[6]

  14. It also has much in common with the London Gangs Matrix, which Amnesty International called, “a racially biased database criminalising a generation of young black men”.

  15. The Integrated Data Platform, mentioned above, will facilitate more of this kind of data linking, potentially at speed. Section 3 of the Strategy says “barriers to accessing data represent a significant limitation on research; these range from legal barriers (real and perceived) through to cultural blockers and risk aversion.” For some projects like the ones outlined above, these barriers provide necessary ethical speedbumps: both the law and the professionalism of public servants provide protections for citizens, and these must be maintained.

  16. Finally in this section, it is worth noting that the Strategy also encourages charities to move towards predictive service delivery, to “reach the people most in need, at the time they most need it” and “prove to very high levels of certainty the effectiveness of certain interventions”. Improved data capability in the charity sector is by no means a bad thing, but there is no indication of where this data would come from or what these very high levels of proof would look like. Moreover, the implications for the funding of experimental and innovative programmes could be stark, and must be considered further.

d. Cross-sector data sharing

  1. The Strategy defines “data availability” as:

“an environment which facilitates appropriate data access, mobility and re-use both across and between the private, third and public sectors in order to generate maximal economic and/or societal benefit for the UK”

  2. Other responses to the Strategy will cover the role of data institutions and intermediaries and the protections required to make them effective.

  3. From a social-impact and data ethics perspective, one concern is that both the technologies and the new social norms created by new frictionless consumer protocols may be misused by government departments.

  4. The Strategy calls these protocols “Smart Data”, and they will be familiar from uses such as energy switching and Open Banking.

  5. At the Work and Pensions Select Committee on 22 October 2020, the DWP Permanent Secretary asserted his department needed to:

“understand whether we can get better access to data on bank account details [to assess people’s savings accounts] … The challenge is we probably need new legislation to get the bulk transfer of data, to enable us to do this at a reasonable scale”[7]

  6. This is presumably one of the “real and perceived” legal barriers for data sharing that the Strategy looks to overturn.

  7. It is possible that the Open Banking protocol could be the technical basis for this, but it is important to note that Open Banking exists within the paradigm of active consumer consent rather than one of government surveillance. It is also possible that using the Open Banking protocol could offer technical improvements to the way that Universal Credit is paid, but this technical improvement must not come at the cost of either privacy or dignity.

  8. To quote a precept from the movie Jurassic Park, “Just because you can, doesn’t mean you should.”

e. Individual trust and responsibility

  1. There is not space in this note to explore all of the implications the Strategy has for individuals, and it is to be assumed that many of these will be represented in the responses to the consultation by data-rights groups.

  2. However, there is a concerning disconnect between the outcomes described above and how “fairness, transparency and trust” are set out in the Strategy in a “pro-growth data rights regime”. Individuals, it says,

“should be empowered to control how their data is used, and supported to have the necessary skills and confidence to take active decisions around the use of their data, in order to contribute to the wider societal benefit data can offer.”

  3. The low-compliance, low-bureaucracy, cross-government data-sharing environment outlined in the plans for “cheaper, better” public services does not seem to admit of the right for those who depend on public services to “be empowered to control how their data is used”. It would be interesting to know more about how this is expected to work, and whether those of us who depend on public services are expected to waive privacy and dignity in order to use them.

2. For Regulators

a. Reduction of legal barriers and compliance

  1. Reducing the “legal barriers (real and perceived)” to data sharing demands an incredibly clear social vision - perhaps even a renewed social contract - as well as broad public agreement on the moral and ethical uses of data. As can be seen from the complexity of the Online Harms debate, the numerous ethical frameworks mentioned in this strategy and elsewhere, and the lengthy Information Commissioner’s Office investigation into Cambridge Analytica, there is not sufficient broad social agreement on the impacts and permissible uses of data to forgo clear and detailed guidance on certain topics. The proposal to move from “burdensome regulation” to “just in time advice” from regulators, moreover, could be incredibly resource-intensive for the CMA and ICO (and potentially also Ofcom, depending on the Online Harms Bill), as new regulatory problems would be unravelled on the fly rather than ex ante.

3. For the Civil Service

a. Civil service accountability

  1. “Accountability” in government is used in the strategy to refer to productivity and a willingness to defer to centralised, expert leadership; it does not refer to public accountability or any form of external oversight.

  2. “Transparency” is also mentioned many times: it is cited as a useful UK value on the global stage, in relation to algorithms, and as being essential for public trust, without any indication of what mechanisms might be put in place to make it possible.

  3. As part of centralised data leadership, there are also plans to import senior technology leaders from business, and a commitment to challenging risk aversion, legal barriers, and bureaucracy.

  4. It should be noted that the aims for and affordances of public data are not the same as those for data held by businesses -- the stakeholders for public data are the citizens of the country, accountable through democratic process, not a set of shareholders who are briefed quarterly and mostly interested in financial return. The UK is not an Amazon shopping basket. Unlike retail customers, most citizens do not have the option of switching to another provider when we no longer enjoy the service.

  5. While many submissions to the consultation will focus on the benefits of an improved data capability in government, the distinction and quality of public service must not be undermined. Working in modern ways that reflect “the Internet era” is not the sole preserve of business, and confusing being able to work with data with knowing how to run an online shop shows a fundamental lack of understanding of data, shops, and public services. The government DDaT capability must not be consumed by the values and attitudes of Silicon Valley entrepreneurs.

b. Transparency

  1. Regrettably, the strategy is presented in a way that takes effort to analyse and understand: there is no print-friendly copy, there are no paragraph numbers for easy citation, and significant policy recommendations are omitted from the main text, appearing only in auxiliary papers such as Joined up data in government. The architects of this plan are not the obvious candidates to implement transparency in a complex field.

c. Trust in government

  1. While there has been much successful public-service technology rolled out in the UK this year, this has also been a year of unprecedented, high-profile failures.

  1. The long-delayed and costly NHS Test and Trace app, the A-level algorithm, and the loss of Covid testing data are the most well-known; each has had very tangible real-world consequences - some of which will ripple for decades to come. In the interest of preserving what is left of public trust in government technology, now is not the time to reduce compliance.

  1. Perhaps rather than thinking of moonshots, the government should take inspiration from another bit of space lore, and invest in its staff to deliver this: for some public services, “failure is not an option”.

4. Additionally

a. The strategy may not be executable

  1. For instance, Mission 2 sets out a number of competing promises, including regulation that works locally and globally but is not so burdensome that it limits either innovation or wider digitisation, yet maintains public trust, while also giving individuals the power and skills to choose which data is shared across public and private sectors. This must also presumably support the “permissive attitude to data sharing” outlined elsewhere.

5. Draft policy recommendations

These recommendations are a very first draft of some measures to address the concerns raised above; they are open for feedback and discussion during the consultation period, and I hope they might form a starting point for some shared recommendations from civil society organisations.

1. Appoint specialist Data Commissioners to champion minoritised communities

There needs to be much more governance ambition in the Strategy. While many, many organisations and frameworks are mentioned, the job of building a fairer society must begin in and with communities. Appointing specialist Data Commissioners with useful powers will build trust with disenfranchised groups while also delivering critical improvements to the government data landscape. Commissioners could work to spot gaps in data provision, commission repair work, set acceptable use policies, and operate as part of an early warning system to flag ethical and social conflicts in their early stages, or before they arise.

2. Adopt a clear, public framework for government data

The strategy refers to many different types and classes of government and public-service data, but it is not clear which kinds of data can and should be used in which ways. Adopting a clear public framework - the ODI Data Spectrum might be a good starting point - would allow different data uses to be mapped to different kinds of data; this would improve clarity for practitioners, contractors, and the public.

3. Prioritise maintenance and repair alongside innovation

The Strategy says that “data can be leveraged to deliver new and innovative services”, but a programme of data repair (including de-biasing data sets and attending to legacy systems) must be undertaken in parallel. This will enable more even progress, with fewer dramatic setbacks, and will help repair the damage done to public trust by the “high watermark” of 2020’s very public failures.

4. Recognise Data Ethics in the Government Digital, Data and Technology (DDaT) Profession Capability Framework

Many ethical codes and frameworks are listed in the Strategy, but they are too granular and distributed to be useful, and there is no clear path for escalation, deliberation and resolution of the issues that are raised. To be worthy of public trust, a data-driven government must adhere to ethical processes as well as principles, and commit to doing so transparently, honestly, and in a timely manner. Recognising a data ethics capability in the DDaT framework would be a first step towards that.


Appx 1: Dependencies and related frameworks

The list of government owners for actions includes the following government departments:

  • DCMS
  • Cabinet Office
  • MHCLG
  • FCDO
  • DfE
  • BEIS
  • HO
  • DEFRA

as well as the ONS and UKRI.

Other related organisations mentioned in the paper include the Centre for Data Ethics and Innovation (CDEI), the Better Regulation Executive, the Data Standards Authority, the Government Data Hub, the National Statistician’s Data Ethics Advisory Committee, the Information Commissioner’s Office, and the Geospatial Commission.

Ofcom is not explicitly mentioned but is referred to as the regulator of choice in the cited and endorsed CDEI “Online Targeting” paper.

Related and dependent documents include:

Other oversight and advisory bodies and methods mentioned include, for government and public bodies:

  • Public Service Data Science Capability Audit
  • Parallel controls for APIs
  • Technology code of practice

For the use of “Smart Data” in business, there will also be a “Smart Data cross-sector working group”, bringing together representatives of the communications, finance, energy and pensions sectors.

Expert and civil society organisations including the Open Data Institute, Alan Turing Institute, Ada Lovelace Institute and the Oxford Internet Institute are also mentioned.

Appx 2: Biography

Rachel Coldicutt is an expert on the social impact of new and emerging technologies.  She was previously CEO of responsible technology think tank Doteveryone, where she led a ground-breaking programme researching how technology is changing society, and developed practical tools for responsible innovation. Prior to that, Rachel spent almost 20 years working at the cutting edge of new technology for companies including the BBC, Microsoft, BT, and Channel 4, and was a pioneer in the digital art world. Rachel is an influential voice on the UK technology scene, and acts as an advisor, board member and trustee to a number of technology-focussed organisations and charities.


[1] The British Computer Society’s analysis of the 2020 ONS employment data shows that IT professionals (a significant group of technology workers) are predominantly younger (<50), white, non-disabled men, far exceeding their representation in the UK working population. BCS Diversity Report 2020: ONS Analysis (24 June 2020).

[2] The procedure was based on the Gender Shades study by Joy Buolamwini and Timnit Gebru. http://gendershades.org/

[3] V. U. Prabhu and A. Birhane, “Large image datasets: A pyrrhic win for computer vision?”, arXiv preprint arXiv:2006.16923, 2020.

[4] Hansard (09 May 2018), vol. 640, col. 835

[5] National Data Strategy

[6] K. Lum and W. Isaac, “To predict and serve?”, https://rss.onlinelibrary.wiley.com/doi/full/10.1111/j.1740-9713.2016.00960.x

[7] The transcript of this session should be at https://committees.parliament.uk/event/1975/formal-meeting-oral-evidence-session/ but the link is broken at the time of writing.