1 of 103

W3C WebRTC WG

TPAC Meeting

September 24, 2024

09:00 - 12:30


Chairs: Bernard Aboba

Harald Alvestrand

Jan-Ivar Bruaroey

2 of 103

W3C WG IPR Policy


3 of 103

W3C Code of Conduct

  • This meeting operates under the W3C Code of Ethics and Professional Conduct
  • We're all passionate about improving WebRTC and the Web, but let's keep the conversations cordial and professional


4 of 103

Safety Reminders

While attending TPAC, follow the health rules:

  • Masks and daily testing are left to individual choice

Please be aware of and respect the personal boundaries of your fellow participants

For information about mask and test availability, what to do if you are not feeling well, and what to do if you test positive for COVID, see:

https://www.w3.org/2024/09/TPAC/health.html


5 of 103

About Today’s Meeting


6 of 103

Other TPAC 2024 Meetings of Interest

  • Tuesday, September 24, 2024
    • 16:30 - 18:00 WEBRTC WG/SCCG Joint Meeting
  • Wednesday, September 25, 2024
    • Breakout Sessions
    • Realtime Web Track:
      • RtpTransport (10:00 - 11:00)
      • Evolved Video Encoding w/WebCodecs (11:15 - 12:15)
      • Sync on Web (13:15 - 14:15)
  • Thursday, September 26, 2024
    • 14:00 - 16:00 WEBRTC WG/MEDIA WG Joint Meeting


7 of 103

For Discussion Today

  • 09:10 - 09:30 State of the WG (Harald)
  • 09:30 - 09:50 WebRTC-PC Recycling to REC (Dom)
  • 09:50 - 10:10 Codec Issues (Henrik Boström & Harald)
  • 10:10 - 10:30 IceTransport Extensions (Sameer)
  • 10:30 - 11:00 Break
  • 11:00 - 11:20 Encoded Source (Guido Urdaneta)
  • 11:20 - 11:40 Timing Model (Bernard, Markus & Youenn)
  • 11:40 - 12:00 RtpTransport (Peter Thatcher)
  • 12:00 - 12:20 Corruption Stats + Encoder Complexity (Erik Språng)
  • 12:20 - 12:30 Wrap-up and Next Steps (Chairs)

Time control:

  • A warning will be given 2 minutes before time is up.
  • Once time has elapsed we will move on to the next item.


8 of 103

Virtual Meeting Tips (Zoom)

  • Both local and remote participants need to be on irc.w3.org channel #webrtc.
  • Use “+q” in irc to get into the speaker queue and “-q” to get out of the speaker queue.
  • To try out WebCodecs over RTCDataChannel (not RTP!) join using a Browser.
  • Please use headphones when speaking to avoid echo.
  • Please wait for microphone access to be granted before speaking.
  • Please state your full name before speaking.


9 of 103

State of the WG (Harald)

Start Time: 09:10 AM

End Time: 09:30 AM


10 of 103

External Environment

  • WebRTC over RTP is the dominant browser VC platform (and decent chunks of non-browser)
  • WebRTC over RTP is being used in many niche applications (e.g. WHIP for recording)
  • Explorations of other protocols (MOQ, WebCodecs over WebTransport) ongoing, have achieved some deployment traction (Zoom)
  • Interoperability is mostly “proprietary app on multiple browsers”


11 of 103

Activity since TPAC 2023

  • Repo activity
    • Mediacapture-main
      • Getting ready for REC (still) - removing non-implemented features
    • Mediacapture-extensions
      • Holding pen for new ideas
    • Webrtc-pc
      • Merging some things from -extensions (when implemented)
        • Rule for merge is “1 implementation and 1 promise”
    • Webrtc-extensions
      • Holding pen for new ideas
    • Webrtc-stats
      • Living Standard-like. Simplification and removal of Old Stuff
    • Webrtc-nv-use-cases
      • Restructure and attempt to make useful
    • Webrtc-Encoded-Transform
      • Clone and metadata -> new product release!


12 of 103

Implementation activity

  • Lots of features implemented in multiple browsers
    • Typically >1700 of 2093 tests in webrtc pass
      • A year ago, approx 1000 of 1200 passed
    • Allowed moving stuff from -extensions to main
  • Lots of work to extend test coverage
    • A strict policy of “all changes must have tests” helps


13 of 103

Things that seem stable (and used)

  • Mediacapture-transform
    • But see timing discussion
  • Mediacapture-record
    • Minor tweaking recently (initial timestamp)
  • Mediacapture-fromelement
  • Mediacapture-image
  • Webrtc-priority
    • (first SCTP implementation waiting for a customer)
  • Mst-content-hint
  • WebRTC-SVC


14 of 103

Major new or expanded topics

  • Webrtc-encoded-transform
    • New functionality desired
  • Webrtc-ice
    • New direction on how to control pursued
  • Platform processing for effects and faces
    • Being pursued in some sync with Media WG
  • Screen capture
    • Largely pursued in SCCG community group


15 of 103

Discussion (End Time: 09:30)


16 of 103

WebRTC-PC Recycling to REC (Dom)

Start Time: 09:30 AM

End Time: 09:50 AM


17 of 103

  • Initial WebRTC Recommendation released in January 2021
  • Since then, 47+ substantive amendments (corrections & additions) have been identified and approved
  • We published 26 of them as candidate amendments in March 2023
  • Among these 47, 28 have demonstrated implementation and interoperability
  • PROPOSAL: Bring these 28 amendments to Last Call for Proposed Amendments so they get integrated into a normative republished Recommendation
    • Starts a 60-day review period
    • Starts an Advisory Committee review
    • Establishes a Patent Review Draft


18 of 103

Amendments aligning with existing implementations

  • Set default values of the RTCConfiguration dictionary, aligning it with current implementations - section 4.2.1 RTCConfiguration Dictionary (PR #2691)
  • Update RTCIceGatheringState, RTCPeerConnectionState, RTCIceConnectionState to clarify the relevant transport it represents - section 4.3.2 RTCIceGatheringState Enum (PR #2680)
  • No longer queue a task in the determine DTMF algorithm - section 7.3 canInsertDTMF algorithm (PR #2742)
  • Align MTI stats with implementations - section 8.6 Mandatory To Implement Stats (PR #2744, PR #2748)
  • Clarify simulcast envelope is determined by negotiation - section 5.4.1 Simulcast functionality (PR #2760)
  • Update explanation of simulcast envelope. - section 5.4.1 Simulcast functionality (PR #2814)
  • Fix ambiguities in the setCodecPreferences() algorithm - section Methods (PR #2847)
  • Reject setParameters(), replaceTrack(), & insertDTMF() after stop() - section Methods (PR #2829)
  • Make removeTrack() a no-op after transceiver.stop() - section Methods (PR #2875)
  • Don't fire connectionstatechange on pc.close() - section Update the connection state (PR #2876)
  • Fix binaryType setter requirements - section Attributes (PR #2909)
  • Change the default value of binaryType - section Attributes (PR #2913)


19 of 103

Untestable amendments

Unobservable IDL changes:

  • Replace DOMTimeStamp in the definition of the RTCCertificateExpiration.expires and of RTCCertificate.expires, and change its origin to certificate creation time - section 4.9.1 RTCCertificateExpiration Dictionary (PR #2686, PR #2700)
  • Remove unused RTCRtpDecodingParameters dictionary - section 5.2.5 RTCRtpDecodingParameters Dictionary (PR #2753)
  • Remove single-value RTCIceCredentialType enum - section 4.2.2 RTCIceCredentialType Enum (PR #2767)
  • Create RTCRtpCodec dictionary and reuse in RTCRtpCodecCapability and RTCRtpCodecParameters definitions - section 5.2.9 RTCRtpCodecParameters Dictionary (PR #2834)
  • Make RTCRtpHeaderExtensionCapability.uri required - section 5.2.12 RTCRtpHeaderExtensionCapability Dictionary (PR #2841)
  • Add empty setParameterOptions as second argument to setParameters for extensibility - section RTCSetParameterOptions Dictionary (PR #2885)

Untestable in WPT:

  • Allow an implementation-defined limit to the number of configured ICE Servers - section 4.2.1 RTCConfiguration Dictionary (PR #2679)
  • Allow encoder resolution alignment in scaleResolutionDownBy. - section 5. RTP Media API (PR #2808)


20 of 103

Interoperable modifications

  • Ensure the connecting state happens whenever an ICE or DTLS transport is new - section 4.3.3 RTCPeerConnectionState Enum (PR #2687)
  • Validate ICE transport settings upfront when setting a configuration - section 4.4.1.6 Set the configuration (PR #2689)
  • Put ICE transport connection in failed state when no candidates are received - section 5.6.4 RTCIceTransportState Enum (PR #2704)
  • Add RTCRtpEncodingParameters.maxFramerate - section Methods (PR #2785)
  • Remove RTCRtpEncodingParameters.scaleResolutionDownBy for audio - section Methods (PR #2772, PR #2799)
  • Default RTCRtpEncodingParameters.scaleResolutionDownBy to 1 for video - section Methods (PR #2772)
  • setCodecPreferences only takes into account receive codecs - section Methods (PR #2926)
  • Add control for the receiver's jitter buffer - section RTCRtpReceiver Interface (PR #2953)


21 of 103

Next steps

  • Consensus of WG to publish updated draft
  • AC Review + wide review
  • If successful, republish the Recommendation with the remaining 17 candidate corrections (and new ones)

(this can be done every 6 months if useful)


22 of 103

Discussion (End Time: 09:50)


23 of 103

Codec Issues (Henrik Boström & Harald)

Start Time: 09:50 AM

End Time: 10:10 AM


24 of 103

Making Codecs Transceiver-specific (Harald)

  • Desired functionality from Encoded Transform: Add custom codecs to SDP negotiation
  • Encoded Transform change discussed in WG meetings from 2023 to April of this year
  • Resulting requirement on webrtc-pc: Codecs need to be specified per transceiver, not per PeerConnection
  • Attempted webrtc-pc change was rolled back because it was incomplete
  • Still working on figuring out details.

24

25 of 103

Transceiver Specific Codecs (Harald)

  • SDP is used to negotiate payload types
  • Payload types are transport-level IDs
  • Payload types can collide for send vs receive (but usually we try to avoid that)
  • SDP on sendrecv is “what we can receive”
  • But where do we specify what we can send?


26 of 103

Codec directionality (Henrik)

Codecs on sendrecv m= section are “what we can receive”*

*fine print: While this is what we prefer to receive (not send), this may still influence what we send! Important for backwards compatibility.

  • JSEP 5.3.1: If preferences are not set, answer MUST use offer’s order. (This is also RECOMMENDED in RFC 3264 section 6.1)
    • “Offer to send” is a valid use case! ✅
  • JSEP 5.2.1: If preferences are set, offer MUST exclude codecs ∉ preferences.
    • If send-only codecs are not preferred, they’re gone! 😱

Conclusion: Preferences can include both send and receive codecs.

  • Q: How to deal with unidirectional codecs?


27 of 103

Codec directionality (Henrik)

RFC 3264 section 5.1 to the rescue…? TL;DR:

  1. SendOnly stream SHOULD indicate codecs used for sending.
  2. RecvOnly stream SHOULD indicate codecs used for receiving.
  3. SendRecv stream SHOULD indicate codecs offerer is willing to “send and receive with”.

Unidirectional: codec filtering on direction is straightforward.

SendRecv: can we mix sendrecv with recvonly codecs?

  • E.g. {H265, H264} where H265 is recvonly and H264 is sendrecv.
  • Allowing this seems useful.
  • Risk: if answerer can sendrecv H265, it may remove H264 from offer.
    • O/A completes but offerer is unable to send anything.


28 of 103

Codec directionality (Henrik)

We should: filter codec preferences based on direction.

Proposal A: Avoid footgun.

  • Only allow sendrecv codecs on sendrecv stream.
  • Encode+decode is always possible, but {H265, H264} is not allowed. ❌

Proposal B: Support “{H265, H264}”.

  • Allow recvonly codec on sendrecv stream. Exclude sendonly.
  • Decode is always possible.
  • Encode is not always possible (e.g. answerer removes H264), but that’s OK.
  • Answerer can add its own recv-only codecs in the answer even if it was not offered. ✅

Proposal C: Max power, max footgun.

  • Allow both sendonly and recvonly. Decode errors possible! App must be smart. 🔥

Proposal D: Change the rules.

  • Relax JSEP 5.2.1 to allow codecs in a=rtpmap you don’t want to receive (not in m=)
  • Non-receive codecs are “I want to know if you can receive this”
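To make the proposals concrete, Proposal B's filtering rule can be sketched as a plain list filter. This is a hypothetical helper, not an existing API; the `direction` field on each codec entry is an assumption for illustration:

```javascript
// Proposal B sketch: on a sendrecv m= section, keep sendrecv and
// recvonly codecs in the preferences, but drop sendonly codecs so the
// offer never lists a codec we cannot receive.
function filterPreferencesForSendRecv(codecs) {
  return codecs.filter(
      (c) => c.direction === 'sendrecv' || c.direction === 'recvonly');
}
```

With this rule, {H265 recvonly, H264 sendrecv} survives intact, while a sendonly-only codec would be excluded from the offer.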


29 of 103

Discussion (End Time: 10:10)


30 of 103

IceTransport Extensions (Sameer)

Start Time: 10:10 AM

End Time: 10:30 AM


31 of 103

Issue 209 - App control over ICE checks

Allow an App to observe, influence, and initiate outgoing ICE checks

But first, lots of questions…

  • How do ICE checks work today?
  • Why does it need to change? How would it help the user?
  • Is it possible with existing API and stats?
  • Could this be configuration rather than an API?

And then…

  • What does the API look like?
  • What can and cannot the App control or influence?
  • What does the API usage look like?
  • How could the API be extended in the future?


32 of 103

How do ICE checks work today?

  • ICE connectivity checks sent at the beginning of an ICE session
    • ICE agent cycles through the discovered candidate pairs
    • Paced ~50ms apart
    • Retransmitted on timeout with exponential backoff
    • Triggered checks to converge faster on working pairs
    • A selected candidate pair is nominated

  • Keepalives sent every 15 seconds when no RTP/RTCP packets sent

  • Consent renewal sent every 4-6 seconds
    • Consent expires after 30 seconds without renewal
    • Makes keepalives redundant
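The timing numbers above can be made concrete. A minimal sketch of the check retransmission schedule, assuming the RFC 5389 defaults (initial RTO of 500 ms, interval doubling per retransmit, Rc = 7 transmissions); the function name is illustrative:

```javascript
// Send times (in ms, relative to the first transmission) of a STUN
// binding request and its retransmits. Per RFC 5389, the retransmission
// interval starts at RTO and doubles after each retransmission.
function stunRetransmitSchedule(rto = 500, rc = 7) {
  const times = [0];
  let interval = rto;
  for (let i = 1; i < rc; i++) {
    times.push(times[i - 1] + interval);
    interval *= 2; // exponential backoff
  }
  return times;
}

// stunRetransmitSchedule() → [0, 500, 1500, 3500, 7500, 15500, 31500]
```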


33 of 103

How could things be improved?

  • It is now possible to retain alternate candidate pairs and switch to a different candidate pair during an ICE session
  • Keepalives / consent renewals will continue on inactive pairs

Example scenario:

  • Start off selecting a candidate pair, retain one or more alternate pairs
    • All candidate pairs share a bandwidth-constrained network link
    • …or an alternate pair uses a power-sensitive network device
  • App wants to reduce ICE checks on the alternate pairs
    • Renew consent every 20 seconds instead of every 5 seconds
  • Active candidate pair degrades, App wants to find a good alternative
    • Send ICE checks on all alternate pairs to find lowest RTT
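The "renew consent every 20 seconds instead of every 5" idea boils down to a per-pair throttle the app would apply when it sees outgoing checks. A hypothetical helper, pure bookkeeping with no WebRTC API involved:

```javascript
// Returns a predicate that allows at most one ICE check per candidate
// pair within each intervalMs window. An app could call this from a
// check-event handler and cancel checks it returns false for.
function makeCheckThrottle(intervalMs) {
  const lastSent = new Map(); // pairId → time (ms) of last allowed check
  return (pairId, nowMs) => {
    const last = lastSent.get(pairId) ?? -Infinity;
    if (nowMs - last < intervalMs) return false; // suppress this check
    lastSent.set(pairId, nowMs);
    return true;
  };
}
```

With the API proposed below, the handler would call `event.preventDefault()` whenever the throttle returns false for `event.candidatePair`.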


34 of 103

Possible with existing API or stats?

  • Currently not possible for an App to influence ICE checks
    • Only possible to control where ICE checks may be sent through the remote candidates or ICE servers made available to the ICE agent

  • RTCIceCandidatePairStats contain some data about ICE checks
    • Latest and total round trip time since the beginning of session
    • Number of ICE check requests sent and received
    • Number of ICE check responses sent and received
    • Number of consent requests sent
    • Possible to calculate average RTT
    • Possible to calculate RTT over a period by calling getStats() frequently
    • Limited data, not suitable for making immediate decisions
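For reference, the "average RTT" and "RTT over a period" computations the slide alludes to look like this, assuming RTCIceCandidatePairStats-shaped objects (totalRoundTripTime in seconds, responsesReceived as a cumulative counter):

```javascript
// Average RTT since the beginning of the session.
function averageRtt({ totalRoundTripTime, responsesReceived }) {
  return responsesReceived > 0 ? totalRoundTripTime / responsesReceived : NaN;
}

// Average RTT between two getStats() snapshots of the same pair,
// using the deltas of the cumulative counters.
function rttOverPeriod(prev, curr) {
  const responses = curr.responsesReceived - prev.responsesReceived;
  const rttSum = curr.totalRoundTripTime - prev.totalRoundTripTime;
  return responses > 0 ? rttSum / responses : NaN;
}
```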


35 of 103

Possible to express with config?

  • Too many variations and parameters
    • How often to ping active pair
    • How often to ping alternate pairs
    • When and how to reduce checks on some pairs
    • Are all alternate pairs treated the same
    • In what order to check pairs
    • When and how to increase checks on some pairs
    • How many outstanding checks to keep

  • Better to provide a focused API surface and let App come up with its own algorithm, optimized to its use case


36 of 103

What is the new API?

partial interface RTCIceTransport {
  // Send an ICE check.
  Promise<RTCIceCheckRequest> checkCandidatePair(RTCIceCandidatePair pair);
  // Fired before the ICE agent sends an ICE check; cancellable.
  attribute EventHandler /* RTCIceCheckEvent */ onicecandidatepaircheck;
};

interface RTCIceCheckEvent : Event { // Cancellable
  readonly attribute RTCIceCandidatePair candidatePair;
  // Resolves when the check is actually sent. Rejected => send failure.
  readonly attribute Promise<RTCIceCheckRequest> request;
};

interface RTCIceCheckRequest {
  readonly attribute ArrayBuffer transactionId;
  readonly attribute DOMHighResTimeStamp sentTime;
  // Resolves when a response is received. Rejected => timeout.
  readonly attribute Promise<RTCIceCheckResponse> response;
};

interface RTCIceCheckResponse {
  readonly attribute DOMHighResTimeStamp receivedTime;
  readonly attribute boolean retransmitted;
  // No error => success.
  readonly attribute RTCIceCheckResponseError? error;
};

  • Event fired before ICE check is sent, can be canceled
  • Conversely, App can initiate an ICE check
  • A promise resolves when an ICE check is sent
  • Another promise resolves when a response is received or the check times out


37 of 103

Mechanics of the API

  • Only STUN binding requests are notified to the App
    • Not other STUN methods, e.g. Allocate, Refresh, or Send/Data indications
    • Not STUN binding indications - these don't generate a response
  • Consent renewals and triggered checks can also be prevented by the App
    • Nothing special in the content of the STUN binding request
  • Retransmits are not notified to the App
    • RTCIceCheckResponse indicates if retransmission occurred
    • Response should not be used to compute RTT, only to gauge reachability
  • Outgoing STUN binding responses cannot be prevented by the App
  • App can only send STUN binding requests, not other STUN methods
    • Always a new STUN transaction, rate-limited
    • Possible to have multiple outstanding checks for a candidate pair


38 of 103

How to use the new API?

const pc = …;
const ice = pc.getTransceivers()[0].sender.transport.iceTransport;

ice.onicecandidatepaircheck = async (event) => {
  if (shouldNotCheck(event.candidatePair)) {
    event.preventDefault(); // prevent this check
    return;
  }
  const request = await event.request;
  handleCheck(request);
};

// send a check
const request = await ice.checkCandidatePair(alternatePair);
handleCheck(request);

async function handleCheck(request) {
  try {
    const response = await request.response;
    if (response.error) {
      // … do something with the error
      return;
    }
    const rtt = response.receivedTime - request.sentTime;
    if (!response.retransmitted) {
      // … do something with rtt
    }
  } catch (error) {
    // … do something with the timeout
  }
}


39 of 103

Future extensibility

  • Read STUN attributes in binding request and response

partial interface RTCIceCheckRequest {
  readonly attribute RTCStunAttributes attributes;
};

partial interface RTCIceCheckResponse {
  readonly attribute RTCStunAttributes attributes;
};

dictionary RTCStunAttributes {
  boolean useCandidate;
  sequence<ArrayBuffer> unknownAttributes;
  sequence<RTCUnrecognizedStunAttribute> unrecognizedAttributes;
};

dictionary RTCUnrecognizedStunAttribute {
  required ArrayBuffer type;
  ArrayBuffer value;
};

  • Set STUN attributes in an outgoing binding request

Promise<RTCIceCheckRequest> checkCandidatePair(RTCIceCandidatePair pair, optional RTCStunAttributes attributes);

  • Set timeout, max retransmission on an outgoing binding request

Promise<RTCIceCheckRequest> checkCandidatePair(RTCIceCandidatePair pair, optional double timeout, optional unsigned short maxRetransmissions);


40 of 103

Discussion (End Time: 10:30)


41 of 103

Break

Start Time: 10:30 AM

End Time: 11:00 AM


42 of 103

Encoded Source (Guido Urdaneta)

Start Time: 11:00 AM

End Time: 11:20 AM


43 of 103

Use Case


44 of 103

Scenario

  • A few nodes read from a reliable server, forward to a P2P network with thousands of nodes
  • Communication with server is expensive
  • Communication between nodes is cheap
  • Nodes are generally unreliable (can join/leave at any time)
  • P2P network topology with redundant paths for reliability (Fan-in and fan-out)



45 of 103

RTCRtpEncodedSource proposal

  • Input PC1 and PC2 provide the same media
  • Output masks failures from either Input PC1 or PC2, not timeouts
  • There might be more than 2 input or output peers


(Diagram: a P2P node with inputs “From Peer I1” / “From Peer I2” via Input PC1 and Input PC2, and outputs “To Peer O1” / “To Peer O2” via the Output PCs.)

  • Use Encoded Transform to read frames from Input PCs
  • Custom processing to produce output frames (discard duplicates, adjust metadata, append app-specific metadata, ...)
  • Write to Output PCs using Encoded Source

46 of 103

Original proposal

  • A few months ago we made a proposal patterned after RTCRtpEncodedSource
    • Similar to a single-side encoded "transform"
  • Updated proposal:
    • Keep the same model
    • Incorporate feedback from the WG and developers


47 of 103

Feedback from original proposal

  • Encoded Source allows more freedom than Encoded Transform
  • Easier to make mistakes
  • Less connected to internal control loops
  • Requires more signals (bandwidth, error handling)


48 of 103

Basic example

// main.js
const worker = new Worker('worker.js');

// Let relayPCs be the set of PCs used to relay frames.
for (let pc of relayPCs) {
  const [sender] = pc.getSenders();
  await sender.createEncodedSource(worker); // similar to replaceTrack()
}

// Let recvPc1, recvPc2 be the receiving PCs.
recvPc1.ontrack = ({receiver}) =>
    receiver.transform = new RTCRtpScriptTransform(worker);
recvPc2.ontrack = ({receiver}) =>
    receiver.transform = new RTCRtpScriptTransform(worker);

49 of 103

Basic example

// worker.js
let sourceWriters = [];

onrtcsenderencodedsource = ({source: {writable}}) => {
  sourceWriters.push(writable.getWriter());
};

onrtctransform = async ({transformer: {readable, writable, options}}) => {
  function transform(frame, controller) {
    if (shouldForward(frame)) { // app-defined (e.g., drop duplicates)
      for (let writer of sourceWriters)
        writer.write(getOutputFrame(frame)); // app-defined (e.g., adjust metadata)
    }
    controller.enqueue(frame); // put original frame back in receiver
  }
  await readable.pipeThrough(new TransformStream({transform})).pipeTo(writable);
};


50 of 103

Errors and signals - developer feedback

  • Keyframe requests
  • Bandwidth information
  • Synchronous errors for incorrect frames
    • Timestamps in the past
    • Decreasing frame ID
    • Handled by failing write
  • Other signals
    • Frames dropped before sending (possibly due to lack of bandwidth)
    • Expected queue time before sending (in pacer)


51 of 103

Signals - Key Frame requests

  • Event for Keyframe requests
    • Similar to the keyframerequest event in Encoded Transform
  • No need for a generateKeyFrame method (present in Encoded Transform)
    • The application would explicitly write the key frame to the source's writable

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSource {
  // Accepts RTCRtpEncoded{Video|Audio}Frame, rejects on incorrect frames
  readonly attribute WritableStream writable;
  attribute EventHandler onkeyframerequest;
};


52 of 103

Bandwidth signals

  • Based on Harald's proposal for congestion control
  • Have a field with bandwidth information:
    • Bitrate recommended for the media from this source
    • Available outgoing bitrate for the ICE candidate supporting the transport
      • Already exposed via stats
  • Have an event fire every time there is a significant change in bandwidth information


53 of 103

Bandwidth and other signals

[Exposed=DedicatedWorker]
interface BandwidthInfo {
  readonly attribute long allocatedBitrate; // bits per second
  readonly attribute long availableOutgoingBitrate;
};

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSource {
  ...
  readonly attribute BandwidthInfo bandwidthInfo;
  attribute EventHandler onbandwidthestimate;
  readonly attribute unsigned long long droppedFrames;
  readonly attribute double expectedSendQueueTime; // milliseconds
};


54 of 103

Example

// worker.js
async function maybeRelayFrame(frame, writer, bandwidthInfo) {
  // Append extra redundancy data to the payload if there is enough bandwidth
  if (bandwidthInfo.allocatedBitrate > kMinBitrateForExtraRedundancyData) {
    appendExtraData(frame);
  }
  await writer.write(frame);
}


55 of 103

API Shape

partial interface RTCRtpSender {
  Promise<undefined> createEncodedSource(
      Worker worker, optional any options, optional sequence<object> transfer);
};

partial interface DedicatedWorkerGlobalScope {
  attribute EventHandler onrtcsenderencodedsource;
};

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSourceEvent : Event {
  readonly attribute RTCRtpSenderEncodedSource encodedSource;
};


56 of 103

API Shape

interface BandwidthInfo {
  readonly attribute long allocatedBitrate; // bits per second
  readonly attribute long availableOutgoingBitrate;
};

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSource {
  readonly attribute WritableStream writable; // Accepts RTCRtpEncoded{Video|Audio}Frame
  attribute EventHandler onkeyframerequest;
  readonly attribute BandwidthInfo bandwidthInfo;
  attribute EventHandler onbandwidthestimate;
  readonly attribute any options;
};


57 of 103

Pros and cons

  • Similar pattern as encoded transform.
    • Proven in production
    • Easy to use and understand
  • Good match for SFU-like operations that are frame centric:
    • Zero-timeout, glitch-free forwarding of frames from redundant paths
    • Drop/adjust frames in response to bandwidth issues
    • No re-encoding
  • Requires waiting for a full frame, which introduces extra latency compared with a packet-based API
  • Future: RTCRtpReceiverEncodedSource?
    • Fan-in for the receiver


58 of 103

Discussion (End Time: 11:20)


59 of 103

Timing Model I (Bernard, Markus & Youenn)

Start Time: 11:20 AM

End Time: 11:40 AM


60 of 103

For Discussion Today

  • Mediacapture-transform
    • Issue 87: What is the timestamp value of the VideoFrame/AudioData from a remote track? (Bernard/Markus)
    • Issue 96: What is the impact of timestamp for video frames enqueued in VideoTrackGenerator? (Youennf)
    • Issue 80: Expectations/Requirements for VideoFrame and AudioData timestamps (Bernard/Markus)
    • Issue 86: Playback and sync of tracks created by VideoTrackGenerator (Bernard/Markus)


61 of 103

Issue 87: What is the timestamp value of the VideoFrame/AudioData from a remote track? (Bernard/Markus)

  • VideoFrame and AudioData Metadata both include a timestamp attribute.
    • WebCodecs defines these attributes as a “presentation timestamp”
      • encoder/decoder behavior not specified for AudioData
    • RVFC says captureTimestamp is not present for remote tracks
  • Sample Code sheds doubt on these statements
    • Encode a VideoFrame and serialize it with timestamp over the wire.
    • On the receiver, deserialize and decode the EncodedVideoChunk with timestamp.
    • Use VideoTrackGenerator to create a MST from the stream of VideoFrames.
    • Call RVFC. Observation:
      • captureTimestamp is set.
      • How? Suggests that timestamp is in fact a capture timestamp!
      • In the WebRTC context, how does this relate to rtpTimestamp?
  • Does the definition of timestamp need to be changed?
    • How does it differ from RVFC captureTime or rtpTimestamp?



65 of 103

Issue 96: What is the impact of timestamp for video frames enqueued in VideoTrackGenerator? (Youennf)

  • Proposed model
    • VideoTrackGenerator does not use nor modify timestamps
      • In particular, VideoTrackGenerator does not buffer video frames
    • Each track’s source defines how VideoFrame objects are created
      • Including computation of the VideoFrame’s timestamp
    • Each track’s sink defines how to use VideoFrame’s timestamps
  • VideoTrackGenerator spec clarification
    • Define “send clone to track” as immediately sending the clone to each of the track’s sinks
      • Probably in mediacapture-main spec

66 of 103

Issue 96: What is the impact of timestamp for video frames enqueued in VideoTrackGenerator? (Youennf)

  • Video track’s sources
    • Capture track: timestamp = capture time, consistent with RVFC
    • WebRTC track: timestamp, receiveTime & rtpTimestamp, consistent with RVFC
    • Canvas track: timestamp = time at which the canvas snapshot is made
  • Video track’s sinks
    • HTMLMediaElement, RTCRtpSender, MediaRecorder
    • By default, timestamp is not used
      • The time at which the frame is submitted is used instead.

67 of 103

Issue 80: Expectations/Requirements for VideoFrame and AudioData timestamps (Bernard)

  • Filed by Chris Cunningham (editor of WebCodecs!) on February 9, 2022.
    • “Is it valid to append multiple VideoFrames or AudioData object with the same timestamp (e.g. timestamp=0) to a MediaStreamTrack? If so what is the behavior? Does the spec describe this?”
  • If VideoTrackGenerator does not use or modify timestamps:
    • Does it pass VideoFrames with duplicate timestamp values on to the MediaStreamTrack?
    • What happens after that (e.g. in HTMLVideoElement)?
      • If timestamp is not used, do all the submitted VideoFrames render?


68 of 103

Issue 86: Playback and sync of tracks created by VideoTrackGenerator (Bernard/Markus)

  • Neither Mediacapture-transform nor Mediacapture-main discuss playout points.
  • If Audio and Video MSTs are combined in a MediaStream, what is supposed to happen?
    • Media Capture & Streams, Section 4.1:

69 of 103

Issue 86: Playback and sync of tracks created by VideoTrackGenerator (Cont’d)

  • How are the playout points determined?
    • Is timestamp used (e.g. as the capture time)?
      • Are all MSTs in a MediaStream assumed to use the same clock?
      • Is the RVFC attribute receiveTime used?
        • Can be used to calculate sender/receiver offset, jitter, and audio/video playout points.
          • Playout points adjusted to achieve sync.
    • Proposal:
      • PR (to mediacapture-?) indicating the role of timestamp and receiveTime in “lip sync”.
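As one example of what receiveTime enables: a hedged sketch of the RFC 3550 §6.4.1 interarrival-jitter estimator, assuming all timestamps have already been converted to the same millisecond clock (function name illustrative):

```javascript
// RFC 3550 interarrival jitter: J += (|D| - J) / 16, where D is the
// change in relative transit time between consecutive packets/frames.
function makeJitterEstimator() {
  let prev = null;
  let jitter = 0;
  return (receiveTimeMs, senderTimestampMs) => {
    if (prev !== null) {
      const d = Math.abs((receiveTimeMs - prev.receiveTimeMs) -
                         (senderTimestampMs - prev.senderTimestampMs));
      jitter += (d - jitter) / 16;
    }
    prev = { receiveTimeMs, senderTimestampMs };
    return jitter;
  };
}
```

A receiver could feed this with (receiveTime, timestamp) pairs per frame and size its playout delay from the smoothed jitter.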

70 of 103

Discussion (End Time: 11:40)


71 of 103

RtpTransport (Peter Thatcher)

Start Time: 11:40 AM

End Time: 12:00 PM


72 of 103

Reminder about use cases

Control/customization of:

  • Payloads and Header Extensions (Codecs, Data, Metadata)
  • Packetization (WebCodecs, WASM)
  • Reliability (FEC, NACK/RTX)
  • Jitter Buffer
  • Congestion Control (bandwidth estimation, pacing, probing)
  • Rate Control (bitrate allocation, encoder rates)
  • Control/Feedback messages

73 of 103

Example Use Cases

  • Send arbitrary data within the same congestion control as audio/video
  • Do processing (face tracking) on video and attach metadata
  • WebCodecs-based support for:
    • AAC over RTP
    • Control of hardware/software failover
    • Per-frame QP rate control
    • Long-term references (LTRs)
    • Spatial scalability with Layer Refresh (LRR)
  • Implement an audio codec using WASM
  • Implement FEC designed for high-loss scenarios
  • Implement a jitter buffer more suitable to streaming applications
  • Forward RTP packets from one PeerConnection to another
  • Implement a congestion control algorithm using L4S signals
  • Implement RTCP messages like LRR, RPSI, SLI, RTCP-XR

74 of 103

Status Update

  • A while ago
    • Agreement in the Working Group to add "piecemeal" (incremental) low-level RTP/RTCP functionality
  • Progress made since then

75 of 103

Things That Have Been Figured Out

  • Transferability/Workers
    • RtpTransport requires DedicatedWorker
    • RtpSendStream/RtpReceiveStream are Transferable
  • Processing lots of packets (optional batch processing)
    • Events carry no payloads
    • You read out the pending data with a “readFoo(long maxNumber)” pattern
  • Custom BWE/Pacing/Probing
    • JS told what RTP packets would have been sent
    • JS can send packets with a particular send time
    • JS told what RTP packets have been sent and what RTCP feedback has been received
    • JS told if a packet is dropped because of "overuse" (according to a lenient congestion controller)
  • BYOB
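The "readFoo(maxNumber)" batch pattern separates the notification ("data is available") from data retrieval. Its core bookkeeping can be sketched independently of the real API (all names here are hypothetical):

```javascript
// The event handler would call readPackets(maxNumber) repeatedly,
// draining the queue in bounded batches until it returns empty.
function makeBatchReader(queue) {
  return function readPackets(maxNumber) {
    return queue.splice(0, maxNumber); // removes and returns up to maxNumber items
  };
}
```

In the proposed API, the loop would live in a worker event handler, e.g. reading batches of 64 packets until an empty array comes back.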

76 of 103

Things In Progress

  • Custom NACK/RTX (Explainer coming)
  • Simulcast (per-MID vs per-RID objects in the API)
  • Going "SDP-less" (Explainer coming)

77 of 103

Conclusion

  • We're making good progress
  • Still much work to do
  • There's a breakout session tomorrow! (10-11am in California A)

78 of 103

Discussion (End Time: 12:00)

78

79 of 103

Corruption Stats + Encoder Complexity (Erik Språng)

Start Time: 12:00 PM

End Time: 12:20 PM

79

80 of 103

Stats issue 787: Corruption Likelihood Metric

The purpose is to provide a metric that indicates the estimated probability that an inbound video stream is experiencing corruption.

We’re targeting outright bugs that cause visual artifacts that are otherwise not visible in any existing stats; it is not intended as a general quality metric. The typical use case is finding problems early and being able to root-cause them to e.g. browser versions, hardware setups, or configuration/experiment rollouts, without having to rely on user feedback reports.

80

double totalCorruptionProbability;

double totalSquaredCorruptionProbability;

unsigned long long corruptionMeasurements;
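
These accumulator fields let an application derive the average corruption probability and its spread without per-frame polling; a sketch of the arithmetic (field names as proposed in issue 787, the helper itself is hypothetical):

```javascript
// Derive mean and standard deviation of the corruption probability
// from the three proposed accumulator stats fields.
function corruptionSummary(stats) {
  const n = stats.corruptionMeasurements;
  if (n === 0) return null; // no measurements yet
  const mean = stats.totalCorruptionProbability / n;
  // E[x^2] - E[x]^2, clamped against floating-point underflow.
  const variance =
    stats.totalSquaredCorruptionProbability / n - mean * mean;
  return { mean, stddev: Math.sqrt(Math.max(variance, 0)) };
}

// Example with illustrative numbers: 10 measurements summing to 1.5.
const summary = corruptionSummary({
  totalCorruptionProbability: 1.5,
  totalSquaredCorruptionProbability: 0.55,
  corruptionMeasurements: 10,
});
// mean = 0.15, variance = 0.055 - 0.0225 = 0.0325
```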

81 of 103

Stats issue 787: Corruption Likelihood Metric

Example implementation: http://www.webrtc.org/experiments/rtp-hdrext/corruption-detection

RTP header extension used as side-channel, transmitting randomly selected image samples for validation

81

82 of 103

Stats issue 787: Corruption Likelihood Metric

82

(Diagram: RTP packet carrying the corruption-detection header extension)

With the encoded images, piggy-back a few (13 with one-byte extensions) randomly selected samples as raw values.

83 of 103

Stats issue 787: Corruption Likelihood Metric

83

(Diagram: RTP packet carrying the corruption-detection header extension)

Compare to the raw decoded sample values on the receive side.

84 of 103

Stats issue 787: Corruption Likelihood Metric

84

Large QP: average over a large area

Low QP: average over a small area

The extension has fields to indicate filter size and allowed error thresholds, so that expected distortions from lossy compression can be suppressed.
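
The receive-side check this enables can be sketched as follows (the helper name and mean filter are illustrative; the actual filtering and thresholds are defined by the extension): the decoded neighborhood is averaged per the signaled filter size, and only differences beyond the allowed threshold count as corruption.

```javascript
// Compare one transmitted reference sample against the filtered
// (here: mean-filtered) decoded neighborhood. Differences within the
// allowed threshold are treated as ordinary compression distortion.
function sampleCorrupted(referenceSample, decodedNeighborhood, threshold) {
  const filtered =
    decodedNeighborhood.reduce((sum, v) => sum + v, 0) /
    decodedNeighborhood.length;
  return Math.abs(referenceSample - filtered) > threshold;
}
```

Widening the filter and threshold at large QP is what keeps expected lossy-compression error from being flagged as corruption.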

85 of 103

Stats issue 787: Corruption Likelihood Metric

85

The stats value is not intended to be tightly coupled to this implementation.

Described as a probability of corruption in the range [0.0, 1.0], generic enough to be used with other implementations.

Future iterations could include:

  • Other forms of more precise side-channel information.
  • Receive-side only implementations
    • Using e.g. natural image statistics or ML models to detect issues

86 of 103

Stats issue 787: Corruption Likelihood Metric

86

Relevant links:

87 of 103

Issue 191: Add API to control encode complexity

87

Add encodeComplexityMode to RTCRtpEncodingParameters:

enum RTCEncodeComplexityMode {
  "low",
  "normal",
  "high"
};

partial dictionary RTCRtpEncodingParameters {
  RTCEncodeComplexityMode encodeComplexityMode = "normal";
};

Specifies the encoding complexity mode relative to "normal" mode:

  • "low" mode results in lower device resource usage and worse compression efficiency
  • "high" mode results in higher device resource usage and better compression efficiency

88 of 103

Issue 191: Add API to control encode complexity

88

Intended use cases:

  • Allow an application to balance bandwidth/CPU usage
    • Makes it possible to reduce bandwidth usage without regressing quality, at the expense of higher CPU usage. This choice can depend on both device type and organizational requirements.

  • Better ability to adapt to type of device
    • Reduce CPU usage for devices with known thermal-throttling issues
    • Increase quality for non-constrained devices (e.g. plugged in meeting room devices)
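
One way the adaptation could look in an application (a sketch: the device-state inputs and the policy function are hypothetical; only encodeComplexityMode itself comes from the proposal):

```javascript
// Map device conditions to the proposed encodeComplexityMode values.
// The input flags are illustrative; an app would source them from its
// own device knowledge (model lists, power state, deployment config).
function pickComplexityMode({ onBattery, thermalThrottled, meetingRoomDevice }) {
  if (thermalThrottled) return 'low';                  // known throttling issues
  if (meetingRoomDevice && !onBattery) return 'high';  // unconstrained device
  return 'normal';
}

// The result would then be applied per encoding, e.g.:
//   const params = sender.getParameters();
//   params.encodings[0].encodeComplexityMode = mode;
//   await sender.setParameters(params);
```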

89 of 103

Discussion (End Time: 12:20)

89

90 of 103

Wrapup and Next Steps

Start Time: 12:20

End Time: 12:30

90

91 of 103

Next Steps

  • Content goes here

91

92 of 103

Spillover

92

93 of 103

Use Case 2: SFU “Lip Sync”

  • SFU relays RTP streams from multiple participants
  • Audio: RTP streams mixed (or forwarded?) on the SFU.
  • Video: RTP streams forwarded by SFU, rendered on client
  • SFU terminates/originates RTCP
  • How does the client sync audio & video from each participant?
    • RTCP SR NTP timestamp reflects SFU wallclock
      • Case a: rtpTimestamp forwarded unmodified from capturer
      • Case b: rtpTimestamp rewritten by SFU
    • Abs-capture-timestamp RTP header extension provides:
      • Abs-capture-timestamp: NTP timestamp of the first byte captured in the original sender’s wallclock.
        • Case a: Abs-capture-timestamp + original rtpTimestamp enables “lip sync” to sender wallclock
      • Estimated-clock-offset: estimated offset between the SFU wallclock and the original sender/capturer wallclock.
        • Case b: Abs-capture-timestamp + estimated-clock-offset + SFU rtpTimestamp enables “lip sync” to sender wallclock
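
Once both streams' capture times are mapped onto a common wallclock (via either case above), client-side lip sync reduces to delaying the faster stream by the latency difference; a sketch with illustrative field names and units (ms):

```javascript
// Given one audio and one video frame from the same participant, each
// stamped with a capture time on a common wallclock, compute the extra
// playout delay to apply to the earlier-arriving stream.
function lipSyncDelayMs(audio, video) {
  // Per-stream end-to-end latency observed at the client.
  const audioLatency = audio.receiveTimeMs - audio.captureTimeMs;
  const videoLatency = video.receiveTimeMs - video.captureTimeMs;
  // Delay the faster stream so both play out at the same point.
  return {
    audioDelayMs: Math.max(0, videoLatency - audioLatency),
    videoDelayMs: Math.max(0, audioLatency - videoLatency),
  };
}
```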

93

94 of 103

94

95 of 103

95

96 of 103

Use Case 3: SFU Audio Sync

  • Same topology as Use Case 2, but goal is to sync audio from multiple participants.
    • Example: Ukulele Lockdown
  • How does the client sync audio from multiple participants?
    • RTCP SR NTP timestamp reflects SFU wallclock
      • Case a: rtpTimestamp forwarded unmodified from capturer
      • Case b: rtpTimestamp rewritten by SFU
    • Abs-capture-timestamp RTP header extension provides:
      • Abs-capture-timestamp: NTP timestamp of the first byte captured in the original sender’s wallclock.
      • Estimated-clock-offset: estimated offset between the SFU wallclock and the original sender/capturer wallclock.
        • Case a: Capturer rtpTimestamp + abs-capture-timestamp + estimated-clock-offset aligns streams on the SFU wallclock
        • Case b: SFU rtpTimestamp + estimated-clock-offset aligns streams on SFU wallclock
      • Stream playout delay adjusted based on max audio playout point (lowest delay)
      • Stream playout delay adjusted based on max audio/video playout point (max delay)
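
With every participant's capture time mapped onto the SFU wallclock, the playout-point adjustment above amounts to delaying each stream to match the slowest one; a sketch with illustrative units (ms):

```javascript
// Extra per-stream delay so all streams share the playout point of the
// highest-latency stream (the max-playout-point policy above).
function alignmentDelays(latenciesMs) {
  const slowest = Math.max(...latenciesMs);
  return latenciesMs.map(l => slowest - l);
}
// alignmentDelays([120, 80, 95]) -> [0, 40, 25]
```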

96

97 of 103

Timing Use Case II: BYOC

  • AudioData needs to be rendered every ptime ms
  • Bring your own Audio Codec (WASM)
  • WASM audio encoder takes AudioData as input, produces EncodedAudioChunks as output.
  • WASM decoder takes EncodedAudioChunks as input, produces AudioData as output.
    • Concealment (and internal FEC/RED) triggered by feeding EncodedAudioChunks with no AudioData to decoder.
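
A sketch of the render-side pull loop (the decoder interface here is hypothetical; only the feed-nothing-to-trigger-concealment behavior is from the slide): every ptime ms the renderer asks for audio, and a gap in the chunk queue becomes a concealment request instead of silence.

```javascript
// One call per ptime tick. A missing EncodedAudioChunk is turned into
// a concealment request (PLC / in-band FEC / RED recovery) rather than
// leaving a hole in the rendered audio.
function nextAudio(decoder, chunkQueue) {
  const chunk = chunkQueue.shift();
  return chunk !== undefined
    ? decoder.decode(chunk)   // normal decode path
    : decoder.conceal();      // chunk lost or late: conceal
}

// Usage with a stub decoder (illustrative):
const stubDecoder = {
  decode: chunk => 'decoded:' + chunk,
  conceal: () => 'plc',
};
const queue = ['chunk-1'];
const first = nextAudio(stubDecoder, queue);  // normal path
const second = nextAudio(stubDecoder, queue); // queue empty: concealment
```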

97

98 of 103

98

99 of 103

99

100 of 103

100

101 of 103

101

102 of 103

102

103 of 103

Thank you

Special thanks to:

WG Participants, Editors & Chairs

103