1 of 20

Whole drawers

Pinned insects

Macro fossils

Insect types

Herbarium sheets

Drawers & sheets

Bound volumes

Surface scans

Multi-light

2 of 20

  • What is digitisation?
  • Should we image everything?
  • Which specimens should we digitise first?
  • What data do we need from the specimens?
  • How much specimen preparation is required?
  • How do we know if the project is a success or not?
  • Who should be involved in each digitisation project?
  • How do we learn from our mistakes for future projects?
  • How do we keep colleagues, senior management and directors informed?

Where to start?

3 of 20

A “typical” digitised specimen

Higher Classification

Scientific name: Ornithoptera victoriae regis Rothschild, 1895

Family: Papilionidae

Location

Locality: Bougainville

Country: Solomon Islands

Continent: Oceania

Collection Event

Recorded by: A S Meek

Specimen

Catalogue number: BMNH(E)102551

Preservative: Dry - mounted

Individual count: 1

Sex: Male

Life stage: Adult

Barcode: 013602485

Permanent URL: https://data.nhm.ac.uk/object/407b7063-f942-42f2-a107-885a82f8cc18/1557705600000
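The record above can be held as a small typed structure. This is a minimal sketch rather than the portal's actual schema: the class name and field names are illustrative and simply mirror the labels on the slide. The `url_parts` helper assumes that the trailing number in the permanent URL is a timestamp in milliseconds since the Unix epoch, which the slide does not state.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from urllib.parse import urlparse


@dataclass(frozen=True)
class SpecimenRecord:
    # Illustrative field names that follow the slide's labels.
    scientific_name: str
    family: str
    locality: str
    country: str
    continent: str
    recorded_by: str
    catalogue_number: str
    preservative: str
    individual_count: int
    sex: str
    life_stage: str
    barcode: str  # stored as text so the leading zero survives
    permanent_url: str

    def url_parts(self):
        """Return (object id, trailing number read as a UTC datetime).

        Assumption: the trailing number is milliseconds since the Unix epoch.
        """
        segments = urlparse(self.permanent_url).path.strip("/").split("/")
        object_id, millis = segments[-2], int(segments[-1])
        return object_id, datetime.fromtimestamp(millis / 1000, tz=timezone.utc)


record = SpecimenRecord(
    scientific_name="Ornithoptera victoriae regis Rothschild, 1895",
    family="Papilionidae",
    locality="Bougainville",
    country="Solomon Islands",
    continent="Oceania",
    recorded_by="A S Meek",
    catalogue_number="BMNH(E)102551",
    preservative="Dry - mounted",
    individual_count=1,
    sex="Male",
    life_stage="Adult",
    barcode="013602485",
    permanent_url=(
        "https://data.nhm.ac.uk/object/"
        "407b7063-f942-42f2-a107-885a82f8cc18/1557705600000"
    ),
)

object_id, stamp = record.url_parts()
print(object_id, stamp.date().isoformat())
```

Keeping the barcode as a string rather than an integer is deliberate: `013602485` would otherwise lose its leading zero.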

4 of 20

Locality: SITE157761 (Saint Helena)

Type: TYPENonType (Non-type)

Specimen ID: 010687385

Storage Location: LOC816449 (Drawer 75)

Taxonomy: TAX1429066 (Quadraceps hopkinski)
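The values above carry a prefix that indicates which field they belong to (SITE, TYPE, LOC, TAX), while the specimen ID is purely numeric. A minimal sketch of routing decoded barcode values by prefix might look like the following. The function name and the output field names are hypothetical, and the prefix list covers only the examples shown on this slide.

```python
# Prefixes taken from the slide's examples; other prefixes may exist.
PREFIX_TO_FIELD = {
    "SITE": "locality",
    "TYPE": "type",
    "LOC": "storage_location",
    "TAX": "taxonomy",
}


def classify_barcode(value):
    """Return (field name, value without its prefix) for one decoded barcode."""
    for prefix, field_name in PREFIX_TO_FIELD.items():
        if value.startswith(prefix):
            return field_name, value[len(prefix):]
    if value.isdigit():
        # Unprefixed, all-digit values are treated as specimen IDs.
        return "specimen_id", value
    return "unknown", value


decoded = ["SITE157761", "TYPENonType", "010687385", "LOC816449", "TAX1429066"]
fields = dict(classify_barcode(v) for v in decoded)
print(fields)
```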

Processed and imported into institutional systems (CMS, public portal)

We would use more but we’ve hit the limits of our software…

5 of 20

1 curatorial unit

1 UID barcode

1 curatorial unit

1 UID barcode

1 herbarium sheet

9 curatorial units

9 UID barcodes

Risks and Issues

  • Defining Curatorial Units

  • What is it, and how can I barcode it?

1 curatorial unit

1 UID barcode

6 of 20

40 curatorial units

40 UID barcodes

1 curatorial unit

1 UID barcode

  • Defining Curatorial Units

  • This isn’t necessarily consistent either between collections or within the same collection.

  • Curators define these by what they consider sensible on an object by object basis.

  • There is a bias in allocating individual UIDs to larger objects, even if they share the same dataset.

  • It can’t simply be a matter of size, as this isn’t an easily quantifiable measure of how curatorial units are allocated.
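One way to make the relationship explicit is to model a physical object as holding one UID barcode per curatorial unit, so that the number of units is a curatorial decision recorded as data rather than derived from size. The sketch below is illustrative only: the class, the `allocate` helper and the sequential nine-digit UID scheme are assumptions, not the museum's system.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class PhysicalObject:
    description: str
    uid_barcodes: list = field(default_factory=list)  # one UID per curatorial unit


def allocate(description, n_units, uid_source):
    """Create an object and draw one UID barcode for each curatorial unit."""
    barcodes = [f"{next(uid_source):09d}" for _ in range(n_units)]
    return PhysicalObject(description, barcodes)


uids = count(1)  # hypothetical UID sequence, zero-padded to nine digits
sheet = allocate("herbarium sheet", 9, uids)    # 1 sheet, 9 curatorial units
single = allocate("single specimen", 1, uids)   # 1 object, 1 curatorial unit
print(len(sheet.uid_barcodes), len(single.uid_barcodes))
```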

7 of 20

Existing Example (jury-rigged)

1D vs 2D Barcode recognition

  • Workflows depend on reliable reading of barcodes from images
  • Multiple barcode types
  • Barcodes have different physical requirements (smallest 6x6mm matrix, readable via handheld scanners)
  • Specimen barcodes must meet conservation standards
  • Comprehensive testing of open source and commercial libraries at different scan resolutions

Results (from 2015)

  • Commercial solutions outperform open source
  • Max. read success 94%
  • Idiosyncratic results (different results on different OS)
  • Need to integrate into our software
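A read success figure such as the one above can be computed by comparing what each library decodes against the known barcode value for every test image. The sketch below shows one way to tabulate this per library and scan resolution. The library names, resolutions and trial outcomes are invented placeholders, not the 2015 test data.

```python
from collections import defaultdict


def read_success_rates(trials):
    """Return {(library, dpi): fraction of barcodes decoded correctly}.

    trials: iterable of (library, dpi, expected_value, decoded_value),
    where decoded_value is None when nothing was read.
    """
    totals = defaultdict(int)
    hits = defaultdict(int)
    for library, dpi, expected, decoded in trials:
        key = (library, dpi)
        totals[key] += 1
        if decoded == expected:
            hits[key] += 1
    return {key: hits[key] / totals[key] for key in totals}


# Invented example trials, for illustration only.
trials = [
    ("library_a", 300, "010687385", "010687385"),
    ("library_a", 300, "013602485", None),
    ("library_b", 300, "010687385", "010687385"),
    ("library_b", 300, "013602485", "013602485"),
    ("library_b", 150, "013602485", "013602435"),  # misread counts as a failure
]

for (library, dpi), rate in sorted(read_success_rates(trials).items()):
    print(f"{library} @ {dpi} dpi: {rate:.0%}")
```

Counting a misread as a failure, rather than only counting non-reads, matters for a workflow that relies on the decoded value being correct.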

8 of 20

Microscope slide workflow – whole slide imaging

Setup

Software

Circa 7 seconds per slide (650-1000 slides per person per day), low cost (£2.5k, Canon SLR, 90mm lens, custom lightbox & stand)
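As a rough check on these figures, the quoted time per slide can be multiplied out to see how many hours of a day it accounts for at each end of the daily range. This is plain arithmetic on the numbers above; how the rest of the working day is spent is not stated on the slide.

```python
SECONDS_PER_SLIDE = 7  # "circa 7 seconds per slide"


def capture_hours(slides_per_day, seconds_per_slide=SECONDS_PER_SLIDE):
    """Hours per day accounted for by the quoted per-slide time."""
    return slides_per_day * seconds_per_slide / 3600


for slides in (650, 1000):
    print(f"{slides} slides x {SECONDS_PER_SLIDE} s = {capture_hours(slides):.2f} h")
```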

Specimen

Barcode

Drawer Location Barcode

Taxon Name Barcode

Image capture

9 of 20

Microscope slide workflow – specimen imaging

Histology slide scanner

High-end setup, c.£125k (Zeiss AxioScan)

Adapted SLR

Low cost setup, c.£2.5k

(SLR, MP65 lens, flashbox & stand)

10 of 20

Pinned insect workflows (types)

Avg. 2 mins 20 seconds per specimen (250 per day), circa 800k specimens done this way

Specimens manually placed, labels separated, barcodes added, photographed, labels re-pinned and specimen returned

11 of 20

Pinned insect workflows (types)

Panels: 1. original specimen, 2. unpinned, 3. laid out with barcode, 4. photo and initial transcription, 5. original image, 6. crop

12 of 20

Pinned insect workflows (high throughput)

ALICE (Angled Label Image Capture Equipment)

6x DSLR cameras, 6 images, labels recognized, transformed & reconstructed

Avg. 800 specimens per day
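To put the daily rates in context, the sketch below estimates how many working days a collection would take at the two pinned insect rates quoted in this deck (250 per day for the types workflow, 800 per day for ALICE). The collection size is a hypothetical figure chosen for illustration, and the calculation ignores setup time, staffing and specimen selection.

```python
import math

# Daily rates quoted in the slides.
RATES_PER_DAY = {
    "pinned insects (types workflow)": 250,
    "pinned insects (ALICE)": 800,
}

COLLECTION_SIZE = 100_000  # hypothetical number of specimens


def days_needed(n_specimens, per_day):
    """Whole working days needed at a constant daily rate."""
    return math.ceil(n_specimens / per_day)


estimates = {
    name: days_needed(COLLECTION_SIZE, rate)
    for name, rate in RATES_PER_DAY.items()
}
for name, days in estimates.items():
    print(f"{name}: {days} days")
```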

13 of 20

Workflow imaging rates

[Figure: imaging rates compared for four workflows (microscope slides, herbarium sheets, standard pinned insects, ALICE pinned insects), arranged from more automated to more manual systems. Chart annotations: no removal, label removal, optimal coll., large/delicate, selecting*.]

14 of 20

NHM Data Portal (data.nhm.ac.uk)

Search, browse & API to records, images, maps, NHM datasets
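As an illustration of API access, the sketch below only builds a search URL and does not send a request. It assumes the portal exposes the standard CKAN `datastore_search` action at the usual path, and `RESOURCE_ID_PLACEHOLDER` stands in for a real dataset resource identifier. Both should be checked against the portal's own API documentation.

```python
from urllib.parse import urlencode

PORTAL = "https://data.nhm.ac.uk"
RESOURCE_ID = "RESOURCE_ID_PLACEHOLDER"  # placeholder, not a real resource id


def search_url(query, limit=5):
    """Build a CKAN-style datastore_search URL (assumed endpoint path)."""
    params = {"resource_id": RESOURCE_ID, "q": query, "limit": limit}
    return f"{PORTAL}/api/3/action/datastore_search?{urlencode(params)}"


print(search_url("Ornithoptera victoriae"))
```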

17.7 billion records in 220k datasets downloaded since 2015

15 of 20

16 of 20

Recent:

  • Digital Object Identifiers (DOIs) for unique searches / downloads
  • Users can access the original data, and/or choose to update data sets – and we can track usage and impact more effectively, including understanding query use 

17 of 20

Coming soon:

  • ORCID integration 
  • Citation tracking and integration
  • Increased specimen linkage to other resources e.g. analysis and 3D
  • Integrated search of specimens and research data

18 of 20

19 of 20

20 of 20

Useful resources

The value of collections:

McGhie, H.A. (2019). Museum collections and biodiversity conservation. Curating Tomorrow, UK.

https://www.curatingtomorrow.co.uk

Recent science from collections and observational data:

GBIF Secretariat. (2019). GBIF Science Review 2019.

https://doi.org/10.15468/QXXG-7K93

Recent publications:

Eversole et al. (2019). Introduction of a novel natural history collection: a model for global scientific collaboration and enhancement of biodiversity infrastructure with a focus on developing countries. Biodiversity and Conservation. https://doi.org/10.1007/s10531-019-01765-0

Funk, V. A. (2018). Collections-based science in the 21st Century. https://doi.org/10.1111/jse.12315

Nelson, G., & Ellis, S. (2018). The Impact of Digitization and Digital Data Mobilization on Biodiversity Research and Outreach. Biodiversity Information Science and Standards, 2, e28470. https://doi.org/10.3897/biss.2.28470

Smith, V. S., & Blagoderov, V. (2012). Bringing collections out of the dark. ZooKeys, 6(209), 1–6. https://doi.org/10.3897/zookeys.209.3699