1 of 12

GailBot:

An automated system for Jeffersonian transcription

Umair, Mertens, Albert & de Ruiter

2 of 12

Current Speech-to-Text

1 *SP2: All right so how's everything going

2 *SP1: It's going okay um this is my only day to sleep late

3 ish this week so that was really nice

4 *SP2: I woke up like twenty minutes ago

5 *SP1: Yeah same

6 *SP2: I had to drive here really quick

7 *SP1: Me too me too I like

8 *SP2: Yeah

3 of 12

Transcription Goal

1 *SP2: All right so how's everything going

2 *SP1: It's going okay um this is my only day to sleep late

3 ish this week so that was really nice

4 *SP2: I woke up like twenty minutes ago

5 *SP1: Yeah same

6 *SP2: I had to drive here really quick

7 *SP1: Me too me too I like

8 *SP2: Yeah

(0.7)

(0.2)

(0.6)
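The timed pauses above, such as (0.7), are what a plain speech-to-text transcript lacks. A minimal sketch of how such markers could be derived from word-level timestamps follows. This is not GailBot's implementation: the function name `annotate_pauses`, the `(token, start, end)` input format, and both thresholds are assumptions chosen for illustration.

```python
from typing import List, Tuple

# Hypothetical input format: (token, start time in seconds, end time in seconds)
Word = Tuple[str, float, float]


def annotate_pauses(words: List[Word],
                    min_gap: float = 0.1,
                    micro_max: float = 0.2) -> str:
    """Insert Jefferson-style pause markers between timed words.

    Gaps shorter than min_gap are ignored, gaps from min_gap up to
    (but not including) micro_max become a micropause "(.)", and
    longer gaps become a timed pause in tenths of a second, e.g. "(0.5)".
    Both thresholds are illustrative assumptions.
    """
    out = []
    prev_end = None
    for token, start, end in words:
        if prev_end is not None:
            gap = start - prev_end
            if gap >= micro_max:
                out.append(f"({gap:.1f})")
            elif gap >= min_gap:
                out.append("(.)")
        out.append(token)
        prev_end = end
    return " ".join(out)


# Made-up timings for three words from the example conversation
words = [("okay", 0.0, 0.5), ("uhm", 1.0, 1.25), ("this", 1.375, 1.5)]
print(annotate_pauses(words))
```

With these made-up timings, the 0.5 s gap is rendered as a timed pause and the 0.125 s gap as a micropause.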

4 of 12

GailBot Architecture

5 of 12

GailBot Architecture

Stage 1

6 of 12

GailBot Architecture

Stage 1

7 of 12

GailBot Architecture

Stage 1

Stage 2

8 of 12

GailBot Architecture

Stage 1

Stage 2

9 of 12

GailBot Architecture

Stage 1

Stage 2

Stage 3

10 of 12

GailBot Architecture

Stage 1

Stage 2

Stage 3

11 of 12

GailBot transcript

*SP1: $=laughs (0.4) All right so:: (.) how's everything going

(1.2)

*SP2: okay (0.5) uhm this is my only day to sleep late ish (.) this week third hour

coming twenty minutes ago (0.3) yeah

*SP1: Hey I drive here really quick

(0.3)

*SP2: I had to drive here really quick

*SP1: Me too (.) and I like

12 of 12

Summary

  • GailBot transcribes some paralinguistic features of talk (e.g., pauses, sound stretches, laughter)
  • Works best with
    • dual-stream audio (each speaker has their own channel)
    • high-quality audio
  • Speeds up (does not replace) human transcription
    • GailBot transcripts require humans to fix errors
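One reason dual-stream audio helps is that speaker attribution comes directly from the channel, so no speaker separation is needed. The sketch below illustrates that idea only and is not GailBot's code: the function name `merge_channels`, the per-channel `(token, start, end)` word lists, and the turn-grouping rule are all assumptions.

```python
from typing import Dict, List, Tuple

# Hypothetical input format: (token, start time in seconds, end time in seconds)
Word = Tuple[str, float, float]


def merge_channels(channels: Dict[str, List[Word]]) -> List[str]:
    """Interleave per-speaker word streams into speaker-labelled turns.

    Each channel is assumed to contain exactly one speaker, so the
    channel key doubles as the speaker label. Words are ordered by
    start time, and consecutive words from one speaker form a turn.
    """
    tagged = [(start, speaker, token)
              for speaker, words in channels.items()
              for token, start, _end in words]
    tagged.sort()

    turns: List[Tuple[str, List[str]]] = []
    for _start, speaker, token in tagged:
        if turns and turns[-1][0] == speaker:
            turns[-1][1].append(token)      # same speaker keeps the turn
        else:
            turns.append((speaker, [token]))  # speaker change starts a new turn
    return [f"*{speaker}: {' '.join(tokens)}" for speaker, tokens in turns]


# Made-up timings for two channels
channels = {
    "SP2": [("All", 0.0, 0.2), ("right", 0.2, 0.5)],
    "SP1": [("Yeah", 0.9, 1.1)],
}
for line in merge_channels(channels):
    print(line)
```

With single-channel audio the speaker label would instead have to be inferred from the signal, which is one more place for errors to enter.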

@HI_Lab_Tufts

@JPdeRuiter

@therealjmertens

Corresponding author:

Muhammad.umair@tufts.edu