1 of 21

Image Recognition for Archaeological Research

Claudia Engel and Justine Issavi

SUL AI Studio Experiments

Jan 23, 2019

Many thanks to Chris Chute, Peter Mangiafico, Scott Haddow, Jochen Kumm

2 of 21

Project Aim:

Apply machine learning techniques to enhance the metadata of Çatalhöyük Research Project’s (ÇRP) image repository.

3 of 21

Today ÇRP has accumulated close to 5TB of data, including:

  • Image repository with a total of ~150,000 images.

    • ~49,000 images have inconsistent or very incomplete metadata.

4 of 21

Flinders Petrie behind a camera at excavations in Abydos (1899).

Courtesy © Petrie Museum of Egyptian Archaeology (UCL)

Archaeology is a destructive science.

Photography has played an essential role in recording the excavation process since the very beginnings of the discipline.

5 of 21

Desired output:

To label ~49,000 images that lack valuable metadata using:

  • A subset of already labeled images in the database
  • A subset of images labeled manually
  • A subset of images that were taken with a whiteboard containing information about the object and photograph

Ultimately, we would like to query images for “burial hole with skeleton” or “bone with stone artifacts.” We also plan to identify particular archaeological objects (e.g. figurines, bucrania, obsidian blades).

Experiments:

  1. Detect images with whiteboards.
  2. Extract textual information + parse the text.
  3. Annotate the whiteboards to isolate and machine-read the handwritten text on them.
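As a rough illustration of experiment step 2, the sketch below builds the JSON body for a text-extraction call to the Google Vision REST endpoint (`images:annotate`). Using this service for the whiteboard text is an assumption made for illustration; the slides do not say which OCR tool was used. The function name `build_annotate_request` and the placeholder image bytes are hypothetical, and no network call is made.

```python
import base64
import json

# Public REST endpoint of the Google Vision API (sending the request needs an API key).
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_annotate_request(image_bytes, feature_type="DOCUMENT_TEXT_DETECTION"):
    """Return the JSON-serialisable body for a single-image annotate request."""
    return {
        "requests": [
            {
                # The API expects the image as base64-encoded text.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                "features": [{"type": feature_type}],
            }
        ]
    }


# Hypothetical stand-in for the bytes of one whiteboard photograph.
body = build_annotate_request(b"placeholder image bytes")
payload = json.dumps(body)  # string that would be POSTed to VISION_ENDPOINT
```

Passing `feature_type="LABEL_DETECTION"` instead would request object labels rather than text for the same image.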

6 of 21

Tagging untagged images

7 of 21

Tagging untagged images

49023

8 of 21

Object recognition

9 of 21

Object recognition

22068

10 of 21

Existing Models

n = 766

11 of 21

Existing Models

“soil”

Google Vision API
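A label such as “soil” comes back from the Vision API inside a `labelAnnotations` list, each entry carrying a `description` and a `score`. The sketch below shows one way such a reply could be reduced to a list of labels. The helper name `extract_labels`, the 0.5 threshold, and the sample values are assumptions for illustration, not results from the project.

```python
def extract_labels(response, min_score=0.5):
    """Return (label, score) pairs with score >= min_score, labels lower-cased."""
    annotations = response["responses"][0].get("labelAnnotations", [])
    return [
        (a["description"].lower(), a["score"])
        for a in annotations
        if a["score"] >= min_score
    ]


# Made-up values, shaped like a Vision API label-detection reply.
sample_response = {
    "responses": [
        {
            "labelAnnotations": [
                {"description": "Soil", "score": 0.93},
                {"description": "Rock", "score": 0.71},
                {"description": "Plant", "score": 0.32},
            ]
        }
    ]
}

labels = extract_labels(sample_response)
```

Lower-casing the labels makes it easier to compare them later with labels from a second service.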

12 of 21

Existing Models

“soil”

Clarifai Predict API

13 of 21

Existing Models

“soil”

Google Vision API

Clarifai Predict API

14 of 21

Existing Models

Histogram of how many images share labels, broken down by number of labels
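A histogram of this kind could be computed roughly as follows, assuming each predictor's output is stored as a mapping from image ID to a list of labels. The function name, the image IDs, and the labels are all hypothetical.

```python
from collections import Counter


def shared_label_histogram(labels_a, labels_b):
    """Map 'number of labels both predictors share' to 'number of images'."""
    histogram = Counter()
    # Only images that both predictors have labeled are counted.
    for image_id in labels_a.keys() & labels_b.keys():
        shared = set(labels_a[image_id]) & set(labels_b[image_id])
        histogram[len(shared)] += 1
    return dict(sorted(histogram.items()))


# Hypothetical label sets from two predictors.
google = {"img1": ["soil", "rock"], "img2": ["soil"], "img3": ["bone", "soil"]}
clarifai = {"img1": ["soil", "rock", "sand"], "img2": ["wall"], "img3": ["soil"]}

histogram = shared_label_histogram(google, clarifai)
```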

15 of 21

Existing Models

“predictor agreement”
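The slides do not define “predictor agreement”. One plausible reading, assumed here purely for illustration, is the Jaccard overlap between the label sets two services return for the same image: shared labels divided by all distinct labels.

```python
def jaccard_agreement(labels_a, labels_b):
    """Jaccard overlap of two label lists: |A ∩ B| / |A ∪ B|."""
    a, b = set(labels_a), set(labels_b)
    if not a and not b:
        return 1.0  # two empty label sets are treated as full agreement
    return len(a & b) / len(a | b)


# Hypothetical labels for one image from two predictors.
score = jaccard_agreement(["soil", "rock"], ["soil", "rock", "sand"])
```

A score of 1.0 means both services returned identical label sets, and 0.0 means they share no label.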

16 of 21

Accuracy

17 of 21

Next

18 of 21

Thank you

?

https://cengel.github.io/Catal-Vision-API

20 of 21

magic-vision.herokuapp.com (created by Peter Mangiafico)
