Automatic Speech Recognition of Radio
in the CLARIAH Media Suite
Eurovision Song Contest 2021 in Rotterdam. Source: Wikimedia/Sietske
Who’s got time for time-based media?
...print sources still dominate for gauging public debate, reactions to popular media, events, etc.
“The CLARIAH Media Suite is one of the applications of the Dutch infrastructure for Digital Humanities and Social Sciences developed in the CLARIAH project. It facilitates access to key Dutch media collections with advanced multimedia search and analysis tools.
The Media Suite is an innovative digital research environment, an experimental environment (LAB), in which we are experimenting with new ways of working with multimedia data collections…. The Media Suite is in a constant process of co-development with its users and, in that sense, it is not a “finished” environment.”
https://mediasuite.clariah.nl/documentation/faq/what-is-it
Today
ASR (huh! yeah!): what is it good for?
(actually, all kinds of things:)
ASR and the Media Suite: parameters
“We bring tools to the data, because for reasons of copyright or privacy these data can not be brought to the tools by simply downloading them.”
ASR in the Media Suite: process & principles
See (in Dutch) Roeland Ordelman “Spraakherkenning voor onderzoek in AV-archieven – Twintig jaar ontwikkeling in Nederland” AVA_net, 2021 https://www.avanet.nl/spraakherkenning-voor-onderzoek-in-av-archieven-twintig-jaar-ontwikkeling-in-nederland/
“Speech Recognition” Beeld en Geluid https://archiefstats.beeldengeluid.nl/speech-recognition
ASR in the Media Suite: Sound and Vision
More complete overview here https://archiefstats.beeldengeluid.nl/speech-recognition/availability
Currently on hold; to resume after the new system is built
Radio in the Media Suite
More complete overview here https://mediasuitedata.clariah.nl/dataset/radio-collection-daan
Radio + ASR in the Media Suite
Status as of 22 November 2021. A more extensive, but outdated, overview is here: https://archiefstats.beeldengeluid.nl/speech-recognition/availability
Radio + ASR in the Media Suite: transcript availability
Learning features + developing strategies
Using and exploring ASR in the Media Suite
Case study: TV on the Radio
At a loss for scholarly inspiration? Check out: https://escincontext.com/resources/bibliography-of-esc-research/
Images from https://eurovision.tv/events
PSA:
If you have not already, please:
1. Getting in, searching in the ASR layer
2.1 Design a query
2.2 Refine your query
2.2.3 Refine your query: Combine with linked data
Please remember to save your refined queries!
3.1 Distant reading: “Eurovisie” (Eurovision)
3.2 Distant reading: comparing
4.0 From distant to close reading: filtering
Source: https://eurovision.tv/douze-points
4.1 Close reading:
4.2 Close reading:
4.3 Close reading:
So...
(thank you)