1 of 23

TEAM BUILDING

The Paris Bible Challenge

Orientation session with Transkribus

David Joseph Wrisley, NYU Abu Dhabi, @DJWrisley

Estelle Guéville, Yale University, @EstelleGvl

RBDH @ UBFC, 16/01/2023

LAD 2013.051

[Add presentation title in slide master mode]

1

2 of 23

OBJECTIVES OF THE PBP CHALLENGE

  • Assessing the quality of computer-generated transcriptions and correcting them using human intelligence.
  • Raising awareness about the rapidly evolving domain of AI and the humanities.
  • Contributing to the quantity and quality of ground truth for the Paris Bible Project.
  • Expanding the project's abbreviation transcription guidelines.
  • Discussing the biases of computer vision in paleography.
  • Supporting the creation of new scribal profiles to help locate and date manuscripts belonging to this tradition.
  • Bringing together students, researchers and library professionals, connecting languages and international communities.
  • Providing an opportunity for experiential learning.
  • Modeling an initial public interaction with the PBP research project and setting a baseline for future encounters.
  • Disseminating correct-a-thon results in posts on the project blog.


3 of 23

TODAY'S OUTLINE

  • Learning about handwritten text recognition

  • The Teams and the Collection

  • Learning to work in the Transkribus Lite interface


4 of 23

HTR (Handwritten Text Recognition)


5 of 23

HTR / OCR

OCR (Optical Character Recognition) has been an active field of research in computer science for many decades.

OCR converts an image of text (text represented as pixels) into text as you would type it on a keyboard. OCR'd text can be searched and processed by a computer, and it is useful for higher-level analysis.

HTR (handwritten text recognition) is a specialized form of OCR for handwritten texts. It is used by archivists worldwide, in many languages.

To use HTR, you need to choose a platform and train the computer to recognize the particular handwriting you are interested in (usually a few dozen pages of ground truth, or GT, are required).
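Ground truth is also what an automatic transcription is measured against. One widely used measure is the character error rate (CER): the number of single-character edits needed to turn the HTR output into the ground truth, divided by the length of the ground truth. The sketch below is only an illustration of that idea, not the project's or Transkribus's own implementation; the function names and the sample line are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    prev = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to ""
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # substitute ca with cb (or match)
            ))
        prev = curr
    return prev[-1]


def character_error_rate(ground_truth: str, htr_output: str) -> float:
    """Edit distance between HTR output and ground truth,
    normalised by the length of the ground truth."""
    if not ground_truth:
        raise ValueError("ground truth must not be empty")
    return levenshtein(htr_output, ground_truth) / len(ground_truth)


# Hypothetical sample line, not taken from the project data.
gt = "In principio creavit Deus"
htr = "In principio creauit Dcus"
print(f"CER: {character_error_rate(gt, htr):.2%}")
```

A lower CER means fewer corrections are left for human transcribers. Whether abbreviations are expanded or kept changes the score, so both texts should follow the same transcription guidelines before they are compared.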


6 of 23

Practical applications of OCR


7 of 23

Let's check out: https://transkribus.ai/ with some print and handwritten examples


8 of 23

Teams and Manuscripts


9 of 23


10 of 23

Inventory

Manuscript                       | Country of origin                         | Date
---------------------------------+-------------------------------------------+----------------------------------
                                 | Italy                                     | 13th century
                                 | England or France                         | ca. 1325
Stanford Libraries ms. 23        | Paris                                     | mid 1250s
Besançon municipal library ms. 4 | France. By or for a Franciscan monastery. | mid 13th century
Besançon municipal library ms. 5 | Unknown                                   | 2nd half of 13th century
                                 | Germany                                   | Third quarter of the 13th century
                                 | Mainz, Germany                            | ca. 1454


11 of 23

TEAMS

Team 1

  • ROBERT LLOYD
  • GAURI BHAGWAT
  • ALICE FOURNIER

Team 2

  • AMANDA HEMMONS
  • ALEXANDRE KEYES
  • LUCIA SOL BEZZECCHI PETROFF

Team 3

  • MARIE NOIROT
  • ANNA CHEMISOVA
  • BENEDICTA ARTHUR

Team 4

  • SHARON GUERRA
  • NINA JACOBSON
  • SUMEYYE TOPKARA

Team 5

  • SONAJ KAILAS
  • KATERI SOULARD
  • JESUS DAVID MACCHI FRANCO

Team 6

  • ELIA COULOT
  • SERHAT ACAR

Team 7

  • UNA FALLER
  • DIEGO RODRIGUEZ


12 of 23

Team 1

Italy

13th century


13 of 23

Team 2

England or France

ca. 1325


14 of 23

Team 3

Stanford 23

Paris

mid 1250s


15 of 23

Team 4

France. By or for a Franciscan monastery

mid 13th century


16 of 23

Team 5

Unknown

2nd half of 13th century


17 of 23

Team 6

Germany

Third quarter of the 13th century


18 of 23

Team 7

Mainz, Germany

ca. 1454


19 of 23

Transkribus Lite


20 of 23

Transkribus Lite 2.0


21 of 23


22 of 23

TRANSKRIBUS LITE (Stanford 23)

  • login
  • selection of a collection
  • number of pages / metadata about pages
  • page status (new / in progress)
  • show layout
    • creating lines / blocks
  • view vs text vs layout
  • virtual keyboard (EGE)
  • save
    • in progress
    • done (when the page is completed)
    • ground truth (Estelle & David)
    • final (post challenge)
  • correcting text

To learn more: Transkribus channel on YouTube


23 of 23

parisbible.github.io

Thank you for your attention!

 

Merci de votre attention! 

شكراً لانتباهكم!
