1 of 20

DataCue API

Chris Needham

12 August 2019

2 of 20

Goals

  • Native UA support for DASH emsg events, across all major browser engines, as part of support for MPEG CMAF content
  • API support for other timed metadata cues
    • e.g., ID3 tags in HLS in WebKit
    • Which others?
  • API support for application-generated timed metadata cues
    • without requiring string serialization (as with existing VTTCue solution)

Continued...

3 of 20

Goals

  • Add support for on-receive and on-start cue event firing
    • Allow the web application to prepare or fetch data needed for when the cue is triggered on the media timeline
  • Add support for cues with unknown end time (valid for entire media duration)
  • Improve timing accuracy of timed metadata and timed text cue event firing
    • cf. existing TextTrackCue support and time marches on in HTML

4 of 20

On-receive and on-start event triggering

  • on-start corresponds to text track cue placement on media timeline
  • on-receive allows fetching of resources prior to display

5 of 20

DASH-IF player architecture

6 of 20

DASH-IF player architecture (type 3)

7 of 20

DASH-IF player architecture (type 1)

8 of 20

Proposal

  • Use existing HTML TextTrack and related APIs as much as possible
  • Extend TextTrack and related APIs as needed
    • Add support for on-receive and on-start event triggering
  • Design to be feature detectable
  • Use WebKit DataCue and aspects of HbbTV API as needed
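
As a rough illustration of the "feature detectable" goal, the sketch below checks a global-like object for the relevant interfaces. The function name detectTimedMetadataSupport and the TimedMetadataSupport shape are made up for this example, and oncuereceived is only a proposed attribute, so the check for it is an assumption about how the proposal might surface.

```typescript
// Hypothetical helper, not part of any specification.
interface TimedMetadataSupport {
  dataCue: boolean;       // a DataCue constructor is exposed
  dispatchType: boolean;  // TextTrack.prototype has inBandMetadataTrackDispatchType
  cueReceived: boolean;   // TextTrack.prototype has the proposed oncuereceived
}

function detectTimedMetadataSupport(
  scope: Record<string, unknown>
): TimedMetadataSupport {
  const textTrack = scope["TextTrack"] as { prototype?: unknown } | undefined;
  const proto = textTrack ? textTrack.prototype : undefined;

  // True when the TextTrack prototype exists and defines the named member.
  const has = (name: string): boolean =>
    typeof proto === "object" && proto !== null && name in proto;

  return {
    dataCue: typeof scope["DataCue"] === "function",
    dispatchType: has("inBandMetadataTrackDispatchType"),
    cueReceived: has("oncuereceived"),
  };
}
```

In a browser, the global object would be passed as scope; taking it as a parameter keeps the sketch runnable outside a browser.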

9 of 20

HbbTV API (MPD events)

  • HbbTV includes a native DASH player (type 1)
  • See HbbTV 2.0.2 spec (9.3.2 MPEG DASH event integration):
    • A TextTrack shall be provided for each event stream signalled in the MPD as defined in MPEG DASH ISO/IEC 23009-1 section 5.10.2
      • excluding DASH-specific events as defined in MPEG DASH ISO/IEC 23009-1 5.10.4 (e.g., MPD validation expiry events)
      • excluding event streams defined by DVB DASH to be consumed by the terminal

10 of 20

HbbTV API (emsg events)

  • See HbbTV 2.0.2 spec (9.3.2 MPEG DASH event integration):
    • A TextTrack shall be provided for each event stream included in currently selected Representations as defined in clause 5.10.3 of MPEG DASH ISO/IEC 23009-1
      • excluding DASH-specific events as defined in MPEG DASH ISO/IEC 23009-1 5.10.4
      • excluding event streams defined by DVB DASH to be consumed by the terminal
  • addtrack and removetrack events report changes to the set of available event streams (MPD or in-band emsg)
  • Proposal: TextTracks created on application request (opt-in)

11 of 20

TextTrack API changes to support DataCue and in-band events

Note: New or changed parts of the API are in bold text

12 of 20

TextTrack API changes

  • HTMLMediaElement.addTextTrack(kind, config) or HTMLMediaElement.addInBandMetadataTrack(type)
    • Allow web applications to receive in-band cues
  • DataCue.value
    • Allow different JavaScript data types as cue payload: String, Number, Array, ArrayBuffer, Object.
  • DataCue.type
    • Purpose: …

13 of 20

TextTrack API changes

  • TextTrackCue.endTime
    • Allow Infinity, to signal that a cue extends for the entire media duration
  • TextTrack.oncuereceived
    • Allow a web application to act on a cue before its point on the media timeline (e.g., to fetch any resources needed)
  • CueEvent
    • Passed to oncuereceived, contains the list of cues that have been received
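
The proposed behaviour can be sketched with a plain in-memory model. This is not the browser API: CueModel and MetadataTrackModel are hypothetical names, and the handler here receives the cue list directly rather than a CueEvent. The model only illustrates two points from the slides, namely that a handler can run when cues are received (before their start time) and that an endTime of Infinity keeps a cue active for the rest of the timeline.

```typescript
// Hypothetical plain-object model; not the browser's TextTrack or DataCue.
interface CueModel {
  id: string;
  type: string;      // mirrors the proposed DataCue.type
  startTime: number; // seconds
  endTime: number;   // seconds; may be Infinity
  value: unknown;    // mirrors the proposed DataCue.value (any JavaScript type)
}

class MetadataTrackModel {
  // Invoked with the newly received cues, ahead of their start time.
  oncuereceived: ((cues: CueModel[]) => void) | null = null;
  private cues: CueModel[] = [];

  // Called when cues are parsed from the media or added by the application.
  receive(cues: CueModel[]): void {
    this.cues.push(...cues);
    if (this.oncuereceived) {
      this.oncuereceived(cues);
    }
  }

  // A cue is active when startTime <= time < endTime,
  // so a cue ending at Infinity never becomes inactive.
  activeCuesAt(time: number): CueModel[] {
    return this.cues.filter((c) => c.startTime <= time && time < c.endTime);
  }
}
```

An application could use the receive-time callback to start fetching a resource referenced by the cue value, so that it is ready when the cue becomes active.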

14 of 20

In-band tracks in HTML

Sourcing in-band text tracks defines support for in-band events for various media formats

Format    inBandMetadataTrackDispatchType
------    -------------------------------
Ogg       Name header field value
WebM      CodecID element value
MPEG-2    PMT stream_type, ES_info descriptor bytes
MPEG-4    mett box (MetaDataSampleEntry) mime_format field or metx box (XMLMetaDataSampleEntry) namespace field

15 of 20

Browser support

Not all browsers have implemented inBandMetadataTrackDispatchType or DataCue

Browser    inBandMetadataTrackDispatchType    DataCue
-------    -------------------------------    -------
Firefox    Yes                                No
Chrome     No                                 No
Safari     Yes                                Yes
Edge       Yes                                Yes (?)

16 of 20

HbbTV TextTrack property / emsg binding

TextTrack property                 In-band emsg events
------------------                 -------------------
kind                               "metadata"
label                              (empty string)
language                           (empty string)
id                                 (empty string)
inBandMetadataTrackDispatchType    schemeIdUri + " " + value
mode                               "hidden"
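
The dispatch type binding above (schemeIdUri, a single space, then value) can be sketched as a pair of helpers. The function names are hypothetical, and splitting on the first space when parsing is an assumption that holds only if the scheme URI itself contains no space.

```typescript
// Hypothetical helpers illustrating the HbbTV binding; not a browser API.

// Builds the dispatch type string from the emsg scheme_id_uri and value fields.
function buildEmsgDispatchType(schemeIdUri: string, value: string): string {
  return schemeIdUri + " " + value;
}

// Splits a dispatch type back into its parts at the first space.
// Assumption: the scheme URI contains no space character.
function parseEmsgDispatchType(
  dispatchType: string
): { schemeIdUri: string; value: string } {
  const separator = dispatchType.indexOf(" ");
  if (separator < 0) {
    throw new Error("dispatch type has no space separator");
  }
  return {
    schemeIdUri: dispatchType.slice(0, separator),
    value: dispatchType.slice(separator + 1),
  };
}
```

An application could compare the track's dispatch type against the string built from the scheme and value it is interested in, and ignore other metadata tracks.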

17 of 20

TextTrackCue property / emsg mapping

TextTrackCue property                            emsg data field
---------------------                            ---------------
DOMString id                                     unsigned int(32) id
double startTime                                 Computed from: unsigned int(32) timescale and unsigned int(32) presentation_time_delta
double endTime                                   Computed from: unsigned int(32) timescale, unsigned int(32) presentation_time_delta, and unsigned int(32) event_duration
See TextTrack.inBandMetadataTrackDispatchType    string scheme_id_uri, string value
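
One way the startTime and endTime computation might look is sketched below. The slide lists only the input fields, so the details are assumptions: presentation_time_delta is taken as an offset from the start of the containing segment (passed in as segmentStartTime, in seconds), and an event_duration of 0xFFFFFFFF is treated as an unknown duration that maps to an endTime of Infinity. The names are hypothetical.

```typescript
// Hypothetical sketch of the emsg-to-cue timing mapping; details are assumptions.
interface EmsgTiming {
  timescale: number;             // ticks per second
  presentationTimeDelta: number; // ticks after the segment start
  eventDuration: number;         // ticks
}

// Assumption: the all-ones 32-bit value signals an unknown duration.
const UNKNOWN_EVENT_DURATION = 0xffffffff;

function emsgCueTimes(
  emsg: EmsgTiming,
  segmentStartTime: number
): { startTime: number; endTime: number } {
  if (emsg.timescale <= 0) {
    throw new RangeError("timescale must be greater than zero");
  }
  const startTime =
    segmentStartTime + emsg.presentationTimeDelta / emsg.timescale;
  const endTime =
    emsg.eventDuration === UNKNOWN_EVENT_DURATION
      ? Infinity
      : startTime + emsg.eventDuration / emsg.timescale;
  return { startTime, endTime };
}
```

Mapping an unknown duration to Infinity ties in with the proposed change to TextTrackCue.endTime.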

18 of 20

Open questions

  • Should in-band events be exposed as ArrayBuffer or as objects specific to the cue type? See this issue
    • ArrayBuffer requires application code to parse the events, but more easily extensible
    • For commonly used events, the UA could provide the data as an object. Want to avoid having to update the UA for new event types?
    • See blink-dev discussion
  • How to identify in-band tracks? See this issue
    • DataCue.type, TextTrack.id, TextTrack.inBandMetadataTrackDispatchType?
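
To give a feel for the ArrayBuffer option, the sketch below shows the kind of parsing an application would take on if the UA exposed a whole version 0 emsg box as raw bytes. It is a simplified illustration, not a complete parser: the names are hypothetical, only version 0 is handled, the special box sizes 0 and 1 are rejected, and the strings are decoded as ASCII although the box format allows UTF-8.

```typescript
// Hypothetical, simplified parser for a version 0 'emsg' box.
interface EmsgV0 {
  schemeIdUri: string;
  value: string;
  timescale: number;
  presentationTimeDelta: number;
  eventDuration: number;
  id: number;
  messageData: Uint8Array;
}

// Reads a null-terminated string; returns it with the offset after the terminator.
// Assumption: ASCII content only.
function readCString(bytes: Uint8Array, offset: number): [string, number] {
  let end = offset;
  while (end < bytes.length && bytes[end] !== 0) end++;
  if (end >= bytes.length) throw new Error("unterminated string");
  let text = "";
  bytes.subarray(offset, end).forEach((b) => {
    text += String.fromCharCode(b);
  });
  return [text, end + 1];
}

function parseEmsgV0(box: ArrayBuffer): EmsgV0 {
  if (box.byteLength < 12) throw new Error("box too short");
  const view = new DataView(box);
  const size = view.getUint32(0); // big-endian box size, including the header
  if (size < 12 || size > box.byteLength) throw new Error("unsupported box size");
  const type = String.fromCharCode(
    view.getUint8(4), view.getUint8(5), view.getUint8(6), view.getUint8(7)
  );
  if (type !== "emsg") throw new Error("not an emsg box");
  if (view.getUint8(8) !== 0) throw new Error("only version 0 is handled");

  // Bytes 9..11 are the FullBox flags; the fields start at offset 12.
  const bytes = new Uint8Array(box, 0, size);
  const [schemeIdUri, afterScheme] = readCString(bytes, 12);
  const [value, afterValue] = readCString(bytes, afterScheme);
  if (afterValue + 16 > size) throw new Error("truncated emsg fields");

  return {
    schemeIdUri,
    value,
    timescale: view.getUint32(afterValue),
    presentationTimeDelta: view.getUint32(afterValue + 4),
    eventDuration: view.getUint32(afterValue + 8),
    id: view.getUint32(afterValue + 12),
    messageData: bytes.slice(afterValue + 16, size),
  };
}
```

Even this reduced version needs bounds checks and string handling, which is the cost side of the ArrayBuffer option; the benefit is that new event schemes need no UA changes.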

19 of 20

Synchronisation

  • Improve timing accuracy of timed metadata and timed text cue event firing
  • Use cases:
    • Enable TextTrackCue rendering to align with scene / shot boundaries
    • Synchronised rendering of visual content, using DataCue. Examples:
      • Audio stream with poster images should align to the audio
      • Sports game with player annotations overlaid on the video
  • Timing is controlled by time marches on steps in HTML
    • Requirement to run between 15 ms and 250 ms.
    • Interpreted differently by browsers: Chrome runs at ~250 ms, Firefox more frequently
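
The effect of the interval on frame-level alignment can be estimated with simple arithmetic. The sketch below assumes that a cue event can fire up to one full interval after the cue's start time, which is a simplification of what browsers actually do; the function name is hypothetical.

```typescript
// Hypothetical estimate: how many video frames late a cue event could fire
// if cue processing runs once per interval (assumed worst case: one interval).
function worstCaseLatencyInFrames(
  intervalMs: number,
  framesPerSecond: number
): number {
  if (intervalMs < 0 || framesPerSecond <= 0) {
    throw new RangeError("interval must be >= 0 and frame rate must be > 0");
  }
  return (intervalMs * framesPerSecond) / 1000;
}
```

Under this assumption, a 250 ms interval at 25 frames per second gives worstCaseLatencyInFrames(250, 25) = 6.25 frames, while a 15 ms interval stays under half a frame. That gap is why the shot-boundary and overlay use cases need tighter timing.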

20 of 20

Next steps

  • Update DataCue explainer with latest text from M&E IG Note
  • Add example code to illustrate each feature