1 of 14

Radio Dicer

Experiments in automatically segmenting radio content

James Dooley

BBC News Labs

2 of 14

There is so much content

Most of it is story driven

3 of 14

Chopping up programmes is a pain

4 of 14

News consumption is changing

5 of 14

Segmentation can help

Visual breakdown of content
Jump to points of interest
Linking to timestamps in programmes
Personalisation

6 of 14

So, how do we do it?

Through the magic of text alignment and fuzzy matching

7 of 14

What tools do we have at our disposal?

Running Orders

Account of what is in the programme, with prepared scripts

Machine Transcription

Speech to Text processing of the audio programme

8 of 14

What tools do we have at our disposal?

Machine Transcription

BBC Kaldi
Trained on BBC content

Running Orders

Vary by programme
Different systems
Hard to extract data

At the BBC, running orders vary from programme to programme. Some are fully scripted, and the presenter sticks to the script very closely.
Others are rougher with their scripts, and the presenters ad lib a lot.

Then there are the systems - some programmes are on OpenMedia, but most are on ENPS, which is a _very_ old running order system - I think it’s been around for over 15 years now?
So there aren’t any APIs, and it’s hard to extract the data.
Some people create their running orders as Word documents. So how do we automate that?

--------

So BBC R&D have done a lot of work with the open source Kaldi toolkit,
and have trained a model using many hours of subtitled content from the BBC archives.
(News Labs have also built an internal product around this model for journalists in the Newsroom, similar to Trint.)

9 of 14

Inputs

Words Array

{
  start: 0.17,
  confidence: 1,
  end: 0.39,
  word: "good",
  punct: "Good",
  index: 0
}

Rundown Array

{
  story: "Headlines",
  script: "Good evening this is the Six O'Clock News..."
}
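
To make the two inputs a bit more concrete, here is a small sketch of how the Words Array might be read. The field meanings in the comments are assumptions based on the example above (times in seconds, `punct` as the display form of `word`), the second sample word is made up for illustration, and `toText` and `toTimestamp` are hypothetical helper names, not part of Radio Dicer.

```javascript
/**
 * Assumed shape of one entry in the Words Array.
 * @typedef {Object} Word
 * @property {number} start       seconds from the start of the audio (assumed)
 * @property {number} end         seconds from the start of the audio (assumed)
 * @property {number} confidence  transcriber confidence (assumed to be 0 to 1)
 * @property {string} word        lowercased form, handy for matching
 * @property {string} punct       display form with case and punctuation
 * @property {number} index       position of the word in the transcript
 */

// Rebuild readable text from the display form of each word.
function toText(words) {
  return words.map((w) => w.punct).join(" ");
}

// Format a time in seconds as mm:ss, e.g. for a "jump to" label.
function toTimestamp(seconds) {
  const whole = Math.floor(seconds);
  const mm = String(Math.floor(whole / 60)).padStart(2, "0");
  const ss = String(whole % 60).padStart(2, "0");
  return `${mm}:${ss}`;
}

// First entry is from the slide; the second is invented for illustration.
const sampleWords = [
  { start: 0.17, confidence: 1, end: 0.39, word: "good", punct: "Good", index: 0 },
  { start: 0.39, confidence: 1, end: 0.8, word: "evening", punct: "evening", index: 1 },
];

console.log(toText(sampleWords));
console.log(toTimestamp(sampleWords[0].start));
```

The point is only that every word carries its own timing, so anything matched against `word` can be turned straight into a position in the programme.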

10 of 14

Here's just an example - The World At One from Radio 4

Again, running order on the left, transcript on the right

This is just a UI to visualise the Radio Dicer output

And you can see Radio Dicer has actually chunked up the programme into different sections (which you can click)

So what the algorithm is doing here is taking the opening words of each story in the running order

and doing some smart matching to find them in the Kaldi transcript. Because each word in the transcript has timing information, we can infer the time at which each story starts.
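
The matching step described above can be sketched roughly as follows. This is a minimal illustration, not Radio Dicer's actual code: it assumes a sliding window over the transcript scored with per-word edit distance, and the function names, the window of 5 opening words, and the 0.8 threshold are all hypothetical choices.

```javascript
// Levenshtein edit distance between two strings (single-row version).
function editDistance(a, b) {
  const row = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    let diagonal = row[0]; // value from the previous row, previous column
    row[0] = i;
    for (let j = 1; j <= b.length; j++) {
      const above = row[j];
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      row[j] = Math.min(above + 1, row[j - 1] + 1, diagonal + cost);
      diagonal = above;
    }
  }
  return row[b.length];
}

// Similarity between 0 and 1, where 1 means the strings are identical.
function similarity(a, b) {
  const longest = Math.max(a.length, b.length);
  return longest === 0 ? 1 : 1 - editDistance(a, b) / longest;
}

// Lowercase and strip punctuation so script text resembles transcript words.
function tokenize(text) {
  return text.toLowerCase().replace(/[^a-z0-9\s']/g, " ").split(/\s+/).filter(Boolean);
}

// Slide the story's opening words along the transcript and keep the best window.
function findStoryStart(words, script, openingLength = 5) {
  const opening = tokenize(script).slice(0, openingLength);
  if (opening.length === 0 || words.length < opening.length) return null;
  let best = null;
  for (let i = 0; i + opening.length <= words.length; i++) {
    let total = 0;
    for (let k = 0; k < opening.length; k++) {
      total += similarity(opening[k], words[i + k].word);
    }
    const score = total / opening.length;
    if (best === null || score > best.score) {
      best = { index: i, start: words[i].start, score };
    }
  }
  return best;
}

// Match every rundown item; drop stories whose best window scores too low.
function segment(words, rundown, minScore = 0.8) {
  return rundown
    .map((item) => ({ story: item.story, ...findStoryStart(words, item.script) }))
    .filter((match) => match.score >= minScore);
}
```

Calling `segment(words, rundown)` with the two input arrays returns one `{ story, index, start, score }` entry per story that was found, and `start` is the time to jump to. Scoring word by word keeps a single mis-transcribed word from sinking an otherwise good match.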

There are some discrepancies, like numbers versus words, and Brexit. Also, the running order doesn’t include things like interviews, audio clips, and music, which aren't spoken by the presenter. This is one of the big differences between the two.

Which is why, for an audience-facing use case of this, you’d ideally want to use the presenter script on the left, because it’s been edited by a person, rather than run into errors like the one Joanna showed, with 300 million instead of 3.

You can see that sometimes the presenter isn't _perfectly on script_ (we called/we asked), there are _directions_ in the script, and then sometimes the transcription gets things wrong too - like _Jeremy Hunter_!
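
One way to reduce some of these discrepancies before matching is to normalise the script text first. The sketch below is only an illustration under loud assumptions: it assumes directions are written in [square brackets] (the real running orders may mark them differently), it only knows the numbers 0 to 10, and `normaliseScript` is a hypothetical name.

```javascript
// Tiny illustrative lookup; a real system would need a proper number parser.
const NUMBER_WORDS = new Map([
  ["0", "zero"], ["1", "one"], ["2", "two"], ["3", "three"], ["4", "four"],
  ["5", "five"], ["6", "six"], ["7", "seven"], ["8", "eight"], ["9", "nine"],
  ["10", "ten"],
]);

// Turn script text into lowercase tokens that look more like transcript words.
function normaliseScript(script) {
  return script
    .replace(/\[[^\]]*\]/g, " ")       // drop directions, ASSUMED to be in [brackets]
    .toLowerCase()
    .replace(/[^a-z0-9\s']/g, " ")     // strip punctuation, keep apostrophes
    .split(/\s+/)
    .filter(Boolean)
    .map((token) => NUMBER_WORDS.get(token) ?? token); // "3" becomes "three"
}

console.log(normaliseScript('[CLIP: PM] The 3 ministers said: "We asked."'));
```

This does nothing for ad libs like "we called" versus "we asked"; those still rely on the fuzzy matching tolerating a few differing words.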

11 of 14

Results

Tried with different spoken programmes
85%+ perfect match for scripted programmes
When presenters ad lib, accuracy decreases
Non-spoken notes in running order cause errors
Transcription errors

Misunderstanding names
Brexit = “breaks it”, “Breck’s. It”, “breakfast”
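
As an illustration of how known mis-hearings like these could be patched up before matching, here is a hedged sketch that merges the offending transcript words back into one. The alias table and `applyAliases` are hypothetical, it assumes the `word` field holds lowercased tokens such as "breck's", and it is not how Radio Dicer or the Kaldi model handles this.

```javascript
// Hypothetical alias table: word sequences the transcriber emits, and the
// word that was probably meant. "breakfast" is left out on purpose because
// it is also a real word that presenters say.
const ALIASES = [
  { heard: ["breaks", "it"], meant: "brexit" },
  { heard: ["breck's", "it"], meant: "brexit" },
];

// Replace each aliased run of words with a single word spanning the same time.
function applyAliases(words, aliases = ALIASES) {
  const out = [];
  let i = 0;
  while (i < words.length) {
    const alias = aliases.find(({ heard }) =>
      heard.every((token, k) => words[i + k] !== undefined && words[i + k].word === token)
    );
    if (alias) {
      const last = words[i + alias.heard.length - 1];
      // Keep the first word's start time and take the end time of the last.
      out.push({ ...words[i], word: alias.meant, end: last.end });
      i += alias.heard.length;
    } else {
      out.push(words[i]);
      i += 1;
    }
  }
  return out; // note: any "index" fields are not renumbered here
}
```

A rule like this will also rewrite a genuine "breaks it", so it is a trade-off rather than a fix; improving the model itself is the better long-term answer.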

12 of 14

13 of 14

Closing Thoughts

Relatively niche use-case
To be open sourced
Feeding improvements into the model
Short term solution to the problem
OpenMedia and other systems will help

14 of 14

Thanks!

Any questions?

You can find me at:

james.dooley@bbc.co.uk | github.com/jamesdools
