Radio Dicer
Experiments in automatically segmenting radio content
James Dooley
BBC News Labs
There is so much content
Most of it is story driven
Chopping up programmes is a pain
News consumption is changing
Segmentation can help
So, how do we do it?
Through the magic of text alignment and fuzzy matching
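As a rough illustration of fuzzy matching, here is a minimal sketch that scores how closely a scripted line matches what a transcriber might have heard. It uses Python's standard `difflib` module as a stand-in; the function name `similarity` and the sample sentences are assumptions for illustration, not necessarily the approach used in Radio Dicer.

```python
from difflib import SequenceMatcher


def similarity(script_words, transcript_words):
    """Return a score from 0 to 1 for how closely two word lists match."""
    matcher = SequenceMatcher(None, script_words, transcript_words, autojunk=False)
    return matcher.ratio()


# Hypothetical example: the transcriber mishears "o'clock" as "a clock".
script = "good evening this is the six o'clock news".split()
heard = "good evening this is the six a clock news".split()
score = similarity(script, heard)
print(f"similarity: {score:.2f}")
```

A score close to 1 suggests the two passages are the same text despite small transcription errors, which is what lets a prepared script be matched against imperfect machine output.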
What tools do we have at our disposal?
Running Orders
Account of what is in the programme, with prepared scripts
Machine Transcription
Speech-to-text processing of the programme audio
Inputs
Words Array
{
  start: 0.17,
  confidence: 1,
  end: 0.39,
  word: "good",
  punct: "Good",
  index: 0
}
Rundown Array
{
  story: "Headlines",
  script: "Good evening this is the Six O'Clock News..."
}
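One way these two inputs could be combined is sketched below: slide a window the length of the script across the transcript words, keep the window with the best fuzzy score, and read the story's start and end times from its first and last word. This is a sketch under stated assumptions, not the Radio Dicer implementation. The names `normalise` and `locate_story`, the "Weather" script, and all timings in the sample data are hypothetical.

```python
import re
from difflib import SequenceMatcher


def normalise(text):
    """Lower-case a script and split it into bare words (keeping apostrophes)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def locate_story(words, script):
    """Return start/end times of the transcript window that best matches the script.

    `words` follows the Words Array shape (dicts with word, start, end).
    Returns None if either input is empty.
    """
    target = normalise(script)
    tokens = [w["word"].lower() for w in words]
    if not target or not tokens:
        return None

    size = min(len(target), len(tokens))
    best_score, best_index = -1.0, 0
    for i in range(len(tokens) - size + 1):
        window = tokens[i:i + size]
        score = SequenceMatcher(None, target, window, autojunk=False).ratio()
        if score > best_score:
            best_score, best_index = score, i

    return {
        "start": words[best_index]["start"],
        "end": words[best_index + size - 1]["end"],
        "score": best_score,
    }


# Hypothetical transcript: one word every half second, each lasting 0.4 s.
tokens = "good evening this is the six o'clock news tonight heavy rain is expected".split()
words = [
    {"word": t, "start": i * 0.5, "end": i * 0.5 + 0.4, "index": i}
    for i, t in enumerate(tokens)
]

headlines = locate_story(words, "Good evening this is the Six O'Clock News")
weather = locate_story(words, "Tonight, heavy rain is expected.")  # hypothetical story
print(headlines)
print(weather)
```

A fixed-size window is the simplest choice and assumes the presenter reads roughly what was scripted. Ad-libs, cuts, or live interviews would need a more tolerant alignment.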
Results
Closing Thoughts
Thanks!