1 of 23

UNDERTAKING THE LARGEST-EVER ASSESSMENT OF THE UK’S WILDLIFE

Tom August, Nick Isaac, Gary Powney & Charlie Outhwaite

2 of 23

Wild bird populations in the UK, 1970-2015. 2017. DEFRA

UK Biodiversity Indicators 2015. JNCC

WWF. 2016. Living Planet Report 2016. Risk and resilience in a new era. WWF International, Gland, Switzerland

3 of 23

How do we know these things?

  • Long term surveys of animal and plant populations
    • Undertaken by professional scientists
    • Undertaken by volunteer naturalists

Project Sapelli, ZSL & ExCiteS

Bird ringing by Hans Olofsson

4 of 23

Countryside survey:

Surveys from 2007, 1998, 1990, 1984 and 1978.

How do we know these things?

Professional surveys

Carey, P.D.; Wallis, S.; Chamberlain, P.M.; Cooper, A.; Emmett, B.A.; Maskell, L.C.; McCann, T.; Murphy, J.; Norton, L.R.; Reynolds, B.; Scott, W.A.; Simpson, I.C.; Smart, S.M.; Ullyett, J.M. 2008. Countryside Survey: UK Results from 2007. NERC/Centre for Ecology & Hydrology

5 of 23

The UK Ladybird Survey receives 20,000 records a year from unpaid members of the public

How do we know these things?

Volunteer surveys

6 of 23

Who is the BRC? Biological Records Centre

The national focus for terrestrial and freshwater recording

7 of 23

Biases in ad-hoc data

Photo: Richard Comont

Time

Space

Detectability

Effort

8 of 23

Accounting for bias

Isaac et al. (2014) Meth Ecol Evol 5: 1052-1060

9 of 23

Occupancy models

  • Account for variation in recording effort and detectability over time
  • Account for bias in space and time
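To make the detectability point concrete, here is a minimal sketch of a basic single-season occupancy model fitted by grid-search maximum likelihood on simulated data. It is not the model used in the analyses described in this talk; the function names, the parameter values (`TRUE_PSI`, `TRUE_P`, `N_VISITS`) and the fitting approach are all assumptions chosen for illustration. It shows that the raw proportion of sites with a detection underestimates occupancy when detection is imperfect, while a model that separates occupancy from detection recovers it.

```python
import math
import random

def simulate_counts(n_sites, n_visits, psi, p, rng):
    """Simulate, per site, the number of visits on which the species was detected."""
    counts = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        detections = sum(1 for _ in range(n_visits) if occupied and rng.random() < p)
        counts.append(detections)
    return counts

def log_likelihood(hist, n_visits, psi, p):
    """Log-likelihood, where hist[y] is the number of sites with y detections."""
    total = 0.0
    for y, n_sites in enumerate(hist):
        if n_sites == 0:
            continue
        # Site occupied and detected on exactly y of the visits.
        prob = psi * math.comb(n_visits, y) * p**y * (1 - p) ** (n_visits - y)
        if y == 0:
            prob += 1 - psi  # never detected because the site is unoccupied
        total += n_sites * math.log(prob)
    return total

def fit_occupancy(counts, n_visits, steps=99):
    """Grid search over occupancy (psi) and detection (p) in (0, 1)."""
    hist = [0] * (n_visits + 1)
    for y in counts:
        hist[y] += 1
    grid = [(i + 1) / (steps + 1) for i in range(steps)]
    best = max((log_likelihood(hist, n_visits, psi, p), psi, p)
               for psi in grid for p in grid)
    return best[1], best[2]

# Illustrative values only (assumptions, not estimates from real data).
TRUE_PSI, TRUE_P, N_VISITS = 0.6, 0.3, 4
counts = simulate_counts(5000, N_VISITS, TRUE_PSI, TRUE_P, random.Random(1))
naive_occupancy = sum(1 for y in counts if y > 0) / len(counts)
psi_hat, p_hat = fit_occupancy(counts, N_VISITS)
print(f"naive: {naive_occupancy:.2f}  modelled: {psi_hat:.2f}  detection: {p_hat:.2f}")
```

The naive figure reflects occupancy multiplied by the chance of at least one detection, which is why it falls short of the true value whenever detection is imperfect.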

10 of 23

What is the computational issue?

Analyses were previously run on our in-house cluster for ~4,000 species.

We parallelised across species

[Diagram: Input -> 1 CPU, 20,000 iterations, 3 chains -> Output; X 4,000]

11 of 23

What is the computational issue?

Analyses were expanded to 12,000 species, some with ~2-week runtimes

[Diagram: Input -> 1 CPU, 20,000 iterations, 3 chains -> Output; X 12,000]

12 of 23

What is the computational issue?

Analyses were expanded to 12,000 species, some with ~2-week runtimes

[Diagram: three jobs, each Input -> 1 CPU, 20,000 iterations, 1 chain -> Output; X 36,000]

13 of 23

What is the computational issue?

Analyses were expanded to 12,000 species, some with ~2-week runtimes

[Diagram: three groups of jobs, each job 1 CPU, 1,000 iterations, 1 chain; each group x20; X 720,000]
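The job counts on these slides follow from simple multiplication. The short sketch below reproduces them from the figures given (12,000 species, 3 chains, 20,000 iterations per chain, 1,000 iterations per job); the variable names are illustrative.

```python
SPECIES = 12_000
CHAINS = 3
ITERATIONS_PER_CHAIN = 20_000
ITERATIONS_PER_JOB = 1_000

# Number of 1,000-iteration jobs needed to cover one 20,000-iteration chain.
chunks_per_chain = ITERATIONS_PER_CHAIN // ITERATIONS_PER_JOB

jobs_one_per_species = SPECIES                           # all chains in one job
jobs_one_per_chain = SPECIES * CHAINS                    # one chain per job
jobs_chunked = SPECIES * CHAINS * chunks_per_chain       # one chunk per job

print(chunks_per_chain, jobs_one_per_species, jobs_one_per_chain, jobs_chunked)
```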

14 of 23

JASMIN solution

[Diagram: three groups of jobs, each job 1 CPU, 1,000 iterations, 1 chain; each group x20; X 720,000. Together: Input -> 20,000 iterations, 3 chains -> Output; X 12,000]
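One way to split a single chain into short jobs is to save the sampler state at the end of each job and resume from it in the next. The slides do not say how the chunks were linked, so the sketch below is an assumption, using a toy random-walk Metropolis sampler (hypothetical `run_chunk`) rather than the real occupancy model. It shows that 20 resumed chunks of 1,000 iterations give exactly the same draws as one run of 20,000.

```python
import math
import random

def run_chunk(state, n_iterations):
    """Advance a toy Metropolis chain targeting a standard normal.

    `state` is (current value, RNG state); returns (draws, new state).
    """
    x, rng_state = state
    rng = random.Random()
    rng.setstate(rng_state)  # resume the random stream where the last job stopped
    draws = []
    for _ in range(n_iterations):
        proposal = x + rng.gauss(0.0, 1.0)
        log_accept = 0.5 * (x * x - proposal * proposal)
        if rng.random() < math.exp(min(0.0, log_accept)):
            x = proposal
        draws.append(x)
    return draws, (x, rng.getstate())

initial_state = (0.0, random.Random(42).getstate())

# One long job of 20,000 iterations.
single_run, _ = run_chunk(initial_state, 20_000)

# Twenty short jobs of 1,000 iterations, each resuming from the previous state.
chunked_run = []
state = initial_state
for _ in range(20):
    draws, state = run_chunk(state, 1_000)
    chunked_run.extend(draws)

print(chunked_run == single_run)
```

In a real workflow the state would be written to disk between jobs, so each short job can be scheduled independently once its predecessor has finished.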

15 of 23

Challenges?

  • Creating thousands of jobs to run required extra scripting
  • Managing thousands of simultaneous jobs was a headache
    • Tracking jobs that failed and restarting them required extra code
  • Handling many more output files added to overheads
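The bookkeeping described above can be sketched as follows: list the output each job should produce, then check which outputs are missing to find the jobs that need resubmitting. The file naming scheme, the `.done` marker files and the species names are all hypothetical, and no scheduler commands are shown because the talk does not describe them.

```python
import tempfile
from pathlib import Path

def expected_outputs(species, n_chains, n_chunks, out_dir):
    """List the marker file each job is expected to write (hypothetical naming)."""
    return [
        Path(out_dir) / f"{name}_chain{chain}_chunk{chunk}.done"
        for name in species
        for chain in range(1, n_chains + 1)
        for chunk in range(1, n_chunks + 1)
    ]

def missing_outputs(paths):
    """Return the expected files that do not exist, i.e. jobs to resubmit."""
    return [path for path in paths if not path.exists()]

with tempfile.TemporaryDirectory() as out_dir:
    species = ["species_a", "species_b"]  # placeholder names
    jobs = expected_outputs(species, n_chains=3, n_chunks=20, out_dir=out_dir)

    # Pretend every job finished except two.
    failed = {jobs[5], jobs[77]}
    for path in jobs:
        if path not in failed:
            path.touch()

    to_resubmit = missing_outputs(jobs)
    n_jobs = len(jobs)
    resubmit_names = [path.name for path in to_resubmit]

print(n_jobs, resubmit_names)
```

Checking for expected outputs, rather than parsing scheduler logs, keeps the restart logic independent of whichever batch system runs the jobs.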

16 of 23

Rewards

  • Trends for thousands of species including groups which could not have been run in-house (e.g. moths)
  • Access, at times, to over 2,000 cores simultaneously
  • A workflow that will allow re-analysis in the future
  • An opportunity to learn, and train others within the organisation
  • An analysis unprecedented in complexity and scope within the UK

17 of 23

Results redacted, sorry!

18 of 23

Next steps

  • The UK Biodiversity indicators will be published this summer
  • What is driving the changes we see?
  • How do trends vary across the UK and between taxonomic and functional groups?
  • How do we best share the results for both scientists and recorders?

19 of 23

Conclusions

  • Wildlife is changing both globally and nationally
  • Citizen science data is a valuable source of information
  • Cutting-edge models require large HPC facilities
  • Results give evidence to support policy decisions and report on successes and failures

20 of 23

With thanks to…

Charlie Outhwaite

@CharlieLouO

PhD Student

Gary Powney

@GaryPowney

Quantitative Ecologist

Nick Isaac

@drNickIsaac

Macroecologist

Thousands of volunteer recorders who contribute their time for free

Recording schemes and societies who support recorders and generously give their data and expertise

The JASMIN team, especially Fatima Chami

21 of 23

22 of 23

Biases in ad-hoc data

23 of 23

Biases in ad-hoc data

Top Grasshopper recorders

Isaac & Pocock (2015) Biol J Linn Soc 115: 522-531

Contribution to citizen science projects often follows the 80:20 rule, where approximately 80% of the data comes from only 20% of the recorders
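The 80:20 pattern is easy to check for any recording scheme: sort recorders by the number of records they submitted and sum the share from the top fifth. The sketch below uses made-up counts for ten recorders, not data from any real scheme, and the function name is illustrative.

```python
def share_from_top_recorders(records_per_recorder, top_fraction=0.2):
    """Fraction of all records contributed by the most active recorders."""
    counts = sorted(records_per_recorder, reverse=True)
    n_top = max(1, round(len(counts) * top_fraction))
    return sum(counts[:n_top]) / sum(counts)

# Made-up record counts for ten recorders (illustration only).
example_counts = [400, 250, 60, 30, 20, 15, 10, 8, 5, 2]
share = share_from_top_recorders(example_counts)
print(f"Top 20% of recorders contributed {share:.1%} of records")
```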