1 of 103

W3C WebRTC WG

TPAC Meeting

September 24, 2024

09:00 - 12:30


Chairs: Bernard Aboba

Harald Alvestrand

Jan-Ivar Bruaroey

2 of 103

W3C WG IPR Policy


3 of 103

W3C Code of Conduct

  • This meeting operates under the W3C Code of Ethics and Professional Conduct
  • We're all passionate about improving WebRTC and the Web, but let's keep the conversations cordial and professional


4 of 103

Safety Reminders

While attending TPAC, follow the health rules:

  • Masks and daily testing are left to individual choice

Please be aware of and respect the personal boundaries of your fellow participants

For information about mask and test availability, what to do if you are not feeling well, and what to do if you test positive for COVID, see:

https://www.w3.org/2024/09/TPAC/health.html


5 of 103

About Today’s Meeting


6 of 103

Other TPAC 2024 Meetings of Interest

  • Tuesday, September 24, 2024
    • 16:30 - 18:00 WEBRTC WG/SCCG Joint Meeting
  • Wednesday, September 25, 2024
    • Breakout Sessions
    • Realtime Web Track:
      • RtpTransport (10:00 - 11:00)
      • Evolved Video Encoding w/WebCodecs (11:15 - 12:15)
      • Sync on Web (13:15 - 14:15)
  • Thursday, September 26, 2024
    • 14:00 - 16:00 WEBRTC WG/MEDIA WG Joint Meeting


7 of 103

For Discussion Today

  • 09:10 - 09:30 State of the WG (Harald)
  • 09:30 - 09:50 WebRTC-PC Recycling to REC (Dom)
  • 09:50 - 10:10 Codec Issues (Henrik Boström & Harald)
  • 10:10 - 10:30 IceTransport Extensions (Sameer)
  • 10:30 - 11:00 Break
  • 11:00 - 11:20 Encoded Source (Guido Urdaneta)
  • 11:20 - 11:40 Timing Model (Bernard, Markus & Youenn)
  • 11:40 - 12:00 RtpTransport (Peter Thatcher)
  • 12:00 - 12:20 Corruption Stats + Encoder Complexity (Erik Språng)
  • 12:20 - 12:30 Wrap-up and Next Steps (Chairs)

Time control:

  • A warning will be given 2 minutes before time is up.
  • Once time has elapsed we will move on to the next item.


8 of 103

Virtual Meeting Tips (Zoom)

  • Both local and remote participants need to be on irc.w3.org channel #webrtc.
  • Use “+q” in irc to get into the speaker queue and “-q” to get out of the speaker queue.
  • To try out WebCodecs over RTCDataChannel (not RTP!) join using a Browser.
  • Please use headphones when speaking to avoid echo.
  • Please wait for microphone access to be granted before speaking.
  • Please state your full name before speaking.


9 of 103

State of the WG (Harald)

Start Time: 09:10 AM

End Time: 09:30 AM


10 of 103

External Environment

  • WebRTC over RTP is the dominant browser VC platform (and decent chunks of non-browser)
  • WebRTC over RTP is being used in many niche applications (e.g. WHIP for recording)
  • Explorations of other protocols (MOQ, WebCodecs over WebTransport) ongoing, have achieved some deployment traction (Zoom)
  • Interoperability is mostly “proprietary app on multiple browsers”


11 of 103

Activity since TPAC 2023

  • Repo activity
    • Mediacapture-main
      • Getting ready for REC (still) - removing non-implemented features
    • Mediacapture-extensions
      • Holding pen for new ideas
    • Webrtc-pc
      • Merging some things from -extensions (when implemented)
        • Rule for merge is “1 implementation and 1 promise”
    • Webrtc-extensions
      • Holding pen for new ideas
    • Webrtc-stats
      • Living Standard-like. Simplification and removal of Old Stuff
    • Webrtc-nv-use-cases
      • Restructure and attempt to make useful
    • Webrtc-Encoded-Transform
      • Clone and metadata -> new product release!


12 of 103

Implementation activity

  • Lots of features implemented in multiple browsers
    • Typically >1700 of 2093 tests in webrtc pass
      • A year ago, approx 1000 of 1200 passed
    • Allowed moving stuff from -extensions to main
  • Lots of work to extend test coverage
    • A strict policy of “all changes must have tests” helps


13 of 103

Things that seem stable (and used)

  • Mediacapture-transform
    • But see timing discussion
  • Mediacapture-record
    • Minor tweaking recently (initial timestamp)
  • Mediacapture-fromelement
  • Mediacapture-image
  • Webrtc-priority
    • (first SCTP implementation waiting for a customer)
  • Mst-content-hint
  • WebRTC-SVC


14 of 103

Major new or expanded topics

  • Webrtc-encoded-transform
    • New functionality desired
  • Webrtc-ice
    • New direction on how to control pursued
  • Platform processing for effects and faces
    • Being pursued in some sync with Media WG
  • Screen capture
    • Largely pursued in SCCG community group


15 of 103

Discussion (End Time: 09:30)


16 of 103

WebRTC-PC Recycling to REC (Dom)

Start Time: 09:30 AM

End Time: 09:50 AM


17 of 103

  • Initial WebRTC Recommendation released in January 2021
  • Since then, 47+ substantive amendments (corrections & additions) have been identified and approved
  • We published 26 of them as candidate amendments in March 2023
  • Among these 47, 28 have demonstrated implementation and interoperability
  • PROPOSAL: Bring these 28 amendments to Last Call for Proposed Amendments so they get integrated into a normative republished Recommendation
    • Starts a 60-day review period
    • Starts an Advisory Committee review
    • Establishes a Patent Review Draft


18 of 103

Amendments aligning with existing implementations

  • Set default values of the RTCConfiguration dictionary, aligning it with current implementations - section 4.2.1 RTCConfiguration Dictionary (PR #2691)
  • Update RTCIceGatheringState, RTCPeerConnectionState, RTCIceConnectionState to clarify the relevant transport it represents - section 4.3.2 RTCIceGatheringState Enum (PR #2680)
  • No longer queue a task in the determine DTMF algorithm - section 7.3 canInsertDTMF algorithm (PR #2742)
  • Align MTI stats with implementations - section 8.6 Mandatory To Implement Stats (PR #2744, PR #2748)
  • Clarify simulcast envelope is determined by negotiation - section 5.4.1 Simulcast functionality (PR #2760)
  • Update explanation of simulcast envelope. - section 5.4.1 Simulcast functionality (PR #2814)
  • Fix ambiguities in the setCodecPreferences() algorithm - section Methods (PR #2847)
  • Reject setParameters(), replaceTrack(), & insertDTMF() after stop() - section Methods (PR #2829)
  • Make removeTrack() a no-op after transceiver.stop() - section Methods (PR #2875)
  • Don't fire connectionstatechange on pc.close() - section Update the connection state (PR #2876)
  • Fix binaryType setter requirements - section Attributes (PR #2909)
  • Change the default value of binaryType - section Attributes (PR #2913)


19 of 103

Untestable amendments

Unobservable IDL changes:

  • Replace DOMTimeStamp in the definition of the RTCCertificateExpiration.expires and of RTCCertificate.expires, and change its origin to certificate creation time - section 4.9.1 RTCCertificateExpiration Dictionary (PR #2686, PR #2700)
  • Remove unused RTCRtpDecodingParameters dictionary - section 5.2.5 RTCRtpDecodingParameters Dictionary (PR #2753)
  • Remove single-value RTCIceCredentialType enum - section 4.2.2 RTCIceCredentialType Enum (PR #2767)
  • Create RTCRtpCodec dictionary and reuse in RTCRtpCodecCapability and RTCRtpCodecParameters definitions - section 5.2.9 RTCRtpCodecParameters Dictionary (PR #2834)
  • Make RTCRtpHeaderExtensionCapability.uri required - section 5.2.12 RTCRtpHeaderExtensionCapability Dictionary (PR #2841)
  • Add empty setParameterOptions as second argument to setParameters for extensibility - section RTCSetParameterOptions Dictionary (PR #2885)

Untestable in WPT:

  • Allow an implementation-defined limit to the number of configured ICE Servers - section 4.2.1 RTCConfiguration Dictionary (PR #2679)
  • Allow encoder resolution alignment in scaleResolutionDownBy. - section 5. RTP Media API (PR #2808)


20 of 103

Interoperable modifications

  • Ensure the connecting state happens whenever an ICE or DTLS transport is new - section 4.3.3 RTCPeerConnectionState Enum (PR #2687)
  • Validate ICE transport settings upfront when setting a configuration - section 4.4.1.6 Set the configuration (PR #2689)
  • Put ICE transport connection in failed state when no candidates are received - section 5.6.4 RTCIceTransportState Enum (PR #2704)
  • Add RTCRtpEncodingParameters.maxFramerate - section Methods (PR #2785)
  • Remove RTCRtpEncodingParameters.scaleResolutionDownBy for audio - section Methods (PR #2772, PR #2799)
  • Default RTCRtpEncodingParameters.scaleResolutionDownBy to 1 for video - section Methods (PR #2772)
  • setCodecPreferences only takes into account receive codecs - section Methods (PR #2926)
  • Add control for the receiver's jitter buffer - section RTCRtpReceiver Interface (PR #2953)


21 of 103

Next steps

  • Consensus of WG to publish updated draft
  • AC Review + wide review
  • If successful, republish the Recommendation with the remaining 17 candidate corrections (and new ones)

(this can be done every 6 months if useful)


22 of 103

Discussion (End Time: 09:50)


23 of 103

Codec Issues (Henrik Boström & Harald)

Start Time: 09:50 AM

End Time: 10:10 AM


24 of 103

Making Codecs Transceiver-specific (Harald)

  • Desired functionality from Encoded Transform: Add custom codecs to SDP negotiation
  • Encoded Transform change discussed in WG meetings from 2023 to April of this year
  • Resulting requirement on webrtc-pc: Codecs need to be specified per transceiver, not per PeerConnection
  • Attempted webrtc-pc change was rolled back because it was incomplete
  • Still working on figuring out details.

24

25 of 103

Transceiver Specific Codecs (Harald)

  • SDP is used to negotiate payload types
  • Payload types are transport-level IDs
  • Payload types can collide for send vs receive (but usually we try to avoid that)
  • SDP on sendrecv is “what we can receive”
  • But where do we specify what we can send?


26 of 103

Codec directionality (Henrik)

Codecs on sendrecv m= section are “what we can receive”*

*fine print: While this is what we prefer to receive (not send), this may still influence what we send! Important for backwards compatibility.

  • JSEP 5.3.1: If preferences are not set, answer MUST use offer’s order. (This is also RECOMMENDED in RFC 3264 section 6.1)
    • “Offer to send” is a valid use case! ✅
  • JSEP 5.2.1: If preferences are set, offer MUST exclude codecs ∉ preferences.
    • If send-only codecs are not preferred, they’re gone! 😱

Conclusion: Preferences can include both send and receive codecs.

  • Q: How to deal with unidirectional codecs?


27 of 103

Codec directionality (Henrik)

RFC 3264 section 5.1 to the rescue…? TL;DR:

  1. SendOnly stream SHOULD indicate codecs used for sending.
  2. RecvOnly stream SHOULD indicate codecs used for receiving.
  3. SendRecv stream SHOULD indicate codecs offerer is willing to “send and receive with”.

Unidirectional: codec filtering on direction is straightforward.

SendRecv: can we mix sendrecv with recvonly codecs?

  • E.g. {H265, H264} where H265 is recvonly and H264 is sendrecv.
  • Allowing this seems useful.
  • Risk: if answerer can sendrecv H265, it may remove H264 from offer.
    • O/A completes but offerer is unable to send anything.


28 of 103

Codec directionality (Henrik)

We should: filter codec preferences based on direction.

Proposal A: Avoid footgun.

  • Only allow sendrecv codecs on sendrecv stream.
  • Encode+decode is always possible, but {H265, H264} is not allowed. ❌

Proposal B: Support “{H265, H264}”.

  • Allow recvonly codec on sendrecv stream. Exclude sendonly.
  • Decode is always possible.
  • Encode is not always possible (e.g. answerer removes H264), but that’s OK.
  • Answerer can add its own recv-only codecs in the answer even if it was not offered. ✅

Proposal C: Max power, max footgun.

  • Allow both sendonly and recvonly. Decode errors possible! App must be smart. 🔥

Proposal D: Change the rules.

  • Relax JSEP 5.2.1 to allow codecs in a=rtpmap you don’t want to receive (not in m=)
  • Non-receive codecs are “I want to know if you can receive this”
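To make the proposals concrete, Proposal B's filtering rule can be sketched as a plain list filter. This is a hypothetical helper, not an existing API; the `direction` field on each codec entry is an assumption for illustration:

```javascript
// Proposal B sketch: on a sendrecv m= section, keep sendrecv and
// recvonly codecs in the preferences, but drop sendonly codecs so the
// offer never lists a codec we cannot receive.
function filterPreferencesForSendRecv(codecs) {
  return codecs.filter(
      (c) => c.direction === 'sendrecv' || c.direction === 'recvonly');
}
```

With this rule, {H265 recvonly, H264 sendrecv} survives intact, while a sendonly-only codec would be excluded from the offer.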


29 of 103

Discussion (End Time: 10:10)


30 of 103

IceTransport Extensions (Sameer)

Start Time: 10:10 AM

End Time: 10:30 AM


31 of 103

Issue 209 - App control over ICE checks

Allow an App to observe, influence, and initiate outgoing ICE checks

But first, lots of questions…

  • How do ICE checks work today?
  • Why does it need to change? How would it help the user?
  • Is it possible with existing API and stats?
  • Could this be configuration rather than an API?

And then…

  • What does the API look like?
  • What can and cannot the App control or influence?
  • What does the API usage look like?
  • How could the API be extended in the future?


32 of 103

How do ICE checks work today?

  • ICE connectivity checks sent at the beginning of an ICE session
    • ICE agent cycles through the discovered candidate pairs
    • Paced ~50ms apart
    • Retransmitted on timeout with exponential backoff
    • Triggered checks to converge faster on working pairs
    • A selected candidate pair is nominated

  • Keepalives sent every 15 seconds when no RTP/RTCP packets sent

  • Consent renewal sent every 4-6 seconds
    • Consent expires after 30 seconds without renewal
    • Makes keepalives redundant
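The timing numbers above can be made concrete. A minimal sketch of the check retransmission schedule, assuming the RFC 5389 defaults (initial RTO of 500 ms, interval doubling per retransmit, Rc = 7 transmissions); the function name is illustrative:

```javascript
// Send times (in ms, relative to the first transmission) of a STUN
// binding request and its retransmits. Per RFC 5389, the retransmission
// interval starts at RTO and doubles after each retransmission.
function stunRetransmitSchedule(rto = 500, rc = 7) {
  const times = [0];
  let interval = rto;
  for (let i = 1; i < rc; i++) {
    times.push(times[i - 1] + interval);
    interval *= 2; // exponential backoff
  }
  return times;
}

// stunRetransmitSchedule() → [0, 500, 1500, 3500, 7500, 15500, 31500]
```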


33 of 103

How could things be improved?

  • It is now possible to retain alternate candidate pairs and switch to a different candidate pair during an ICE session
  • Keepalives / consent renewals will continue on inactive pairs

Example scenario:

  • Start off selecting a candidate pair, retain one or more alternate pairs
    • All candidate pairs share a bandwidth-constrained network link
    • …or an alternate pair uses a power-sensitive network device
  • App wants to reduce ICE checks on the alternate pairs
    • Renew consent every 20 seconds instead of every 5 seconds
  • Active candidate pair degrades, App wants to find a good alternative
    • Send ICE checks on all alternate pairs to find lowest RTT
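The "renew consent every 20 seconds instead of every 5" idea boils down to a per-pair throttle the app would apply when it sees outgoing checks. A hypothetical helper, pure bookkeeping with no WebRTC API involved:

```javascript
// Returns a predicate that allows at most one ICE check per candidate
// pair within each intervalMs window. An app could call this from a
// check-event handler and cancel checks it returns false for.
function makeCheckThrottle(intervalMs) {
  const lastSent = new Map(); // pairId → time (ms) of last allowed check
  return (pairId, nowMs) => {
    const last = lastSent.get(pairId) ?? -Infinity;
    if (nowMs - last < intervalMs) return false; // suppress this check
    lastSent.set(pairId, nowMs);
    return true;
  };
}
```

With the API proposed below, the handler would call `event.preventDefault()` whenever the throttle returns false for `event.candidatePair`.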


34 of 103

Possible with existing API or stats?

  • Currently not possible for an App to influence ICE checks
    • Only possible to control where ICE checks may be sent through the remote candidates or ICE servers made available to the ICE agent

  • RTCIceCandidatePairStats contain some data about ICE checks
    • Latest and total round trip time since the beginning of session
    • Number of ICE check requests sent and received
    • Number of ICE check responses sent and received
    • Number of consent requests sent
    • Possible to calculate average RTT
    • Possible to calculate RTT over a period by calling getStats() frequently
    • Limited data, not suitable for making immediate decisions
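For reference, the "average RTT" and "RTT over a period" computations the slide alludes to look like this, assuming RTCIceCandidatePairStats-shaped objects (totalRoundTripTime in seconds, responsesReceived as a cumulative counter):

```javascript
// Average RTT since the beginning of the session.
function averageRtt({ totalRoundTripTime, responsesReceived }) {
  return responsesReceived > 0 ? totalRoundTripTime / responsesReceived : NaN;
}

// Average RTT between two getStats() snapshots of the same pair,
// using the deltas of the cumulative counters.
function rttOverPeriod(prev, curr) {
  const responses = curr.responsesReceived - prev.responsesReceived;
  const rttSum = curr.totalRoundTripTime - prev.totalRoundTripTime;
  return responses > 0 ? rttSum / responses : NaN;
}
```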


35 of 103

Possible to express with config?

  • Too many variations and parameters
    • How often to ping active pair
    • How often to ping alternate pairs
    • When and how to reduce checks on some pairs
    • Are all alternate pairs treated the same
    • In what order to check pairs
    • When and how to increase checks on some pairs
    • How many outstanding checks to keep

  • Better to provide a focused API surface and let App come up with its own algorithm, optimized to its use case


36 of 103

What is the new API?

partial interface RTCIceTransport {
  // Send an ICE check.
  Promise<RTCIceCheckRequest> checkCandidatePair(RTCIceCandidatePair pair);
  // Fired before the ICE agent sends an ICE check; cancellable.
  attribute EventHandler /* RTCIceCheckEvent */ onicecandidatepaircheck;
};

interface RTCIceCheckEvent : Event { // Cancellable
  readonly attribute RTCIceCandidatePair candidatePair;
  // Resolves when the check is actually sent. Rejected => send failure.
  readonly attribute Promise<RTCIceCheckRequest> request;
};

interface RTCIceCheckRequest {
  readonly attribute ArrayBuffer transactionId;
  readonly attribute DOMHighResTimeStamp sentTime;
  // Resolves when a response is received. Rejected => timeout.
  readonly attribute Promise<RTCIceCheckResponse> response;
};

interface RTCIceCheckResponse {
  readonly attribute DOMHighResTimeStamp receivedTime;
  readonly attribute boolean retransmitted;
  // No error => success.
  readonly attribute RTCIceCheckResponseError? error;
};

  • Event fired before ICE check is sent, can be canceled
  • Conversely, App can initiate an ICE check
  • A promise resolves when an ICE check is sent
  • Another promise resolves when a response is received or the check times out


37 of 103

Mechanics of the API

  • Only STUN binding requests are notified to the App
    • Not other STUN methods, e.g. Allocate, Refresh, or Send/Data indications
    • Not STUN binding indications - these don't generate a response
  • Consent renewals and triggered checks can also be prevented by the App
    • Nothing special in the content of the STUN binding request
  • Retransmits are not notified to the App
    • RTCIceCheckResponse indicates if retransmission occurred
    • Response should not be used to compute RTT, only to gauge reachability
  • Outgoing STUN binding responses cannot be prevented by the App
  • App can only send STUN binding requests, not other STUN methods
    • Always a new STUN transaction, rate-limited
    • Possible to have multiple outstanding checks for a candidate pair


38 of 103

How to use the new API?

const pc = …;
const ice = pc.getTransceivers()[0].sender.transport.iceTransport;

ice.onicecandidatepaircheck = async (event) => {
  if (shouldNotCheck(event.candidatePair)) {
    event.preventDefault(); // prevent this check
    return;
  }
  const request = await event.request;
  handleCheck(request);
};

// send a check
const request = await ice.checkCandidatePair(alternatePair);
handleCheck(request);

async function handleCheck(request) {
  try {
    const response = await request.response;
    if (response.error) {
      // … do something with the error
      return;
    }
    const rtt = response.receivedTime - request.sentTime;
    if (!response.retransmitted) {
      // … do something with rtt
    }
  } catch (error) {
    // … do something with the timeout
  }
}


39 of 103

Future extensibility

  • Read STUN attributes in binding request and response

partial interface RTCIceCheckRequest {
  readonly attribute RTCStunAttributes attributes;
};

partial interface RTCIceCheckResponse {
  readonly attribute RTCStunAttributes attributes;
};

dictionary RTCStunAttributes {
  boolean useCandidate;
  sequence<ArrayBuffer> unknownAttributes;
  sequence<RTCUnrecognizedStunAttribute> unrecognizedAttributes;
};

dictionary RTCUnrecognizedStunAttribute {
  required ArrayBuffer type;
  ArrayBuffer value;
};

  • Set STUN attributes in an outgoing binding request

Promise<RTCIceCheckRequest> checkCandidatePair(RTCIceCandidatePair pair, optional RTCStunAttributes attributes);

  • Set timeout, max retransmission on an outgoing binding request

Promise<RTCIceCheckRequest> checkCandidatePair(RTCIceCandidatePair pair, optional double timeout, optional unsigned short maxRetransmissions);


40 of 103

Discussion (End Time: 10:30)


41 of 103

Break

Start Time: 10:30 AM

End Time: 11:00 AM


42 of 103

Encoded Source (Guido Urdaneta)

Start Time: 11:00 AM

End Time: 11:20 AM


43 of 103

Use Case


44 of 103

Scenario

  • A few nodes read from a reliable server, forward to a P2P network with thousands of nodes
  • Communication with server is expensive
  • Communication between nodes is cheap
  • Nodes are generally unreliable (can join/leave at any time)
  • P2P network topology with redundant paths for reliability (Fan-in and fan-out)



45 of 103

RTCRtpEncodedSource proposal

  • Input PC1 and PC2 provide the same media
  • Output masks failures from either Input PC1 or PC2, not timeouts
  • There might be more than 2 input or output peers


(Diagram: a P2P node with inputs “From Peer I1” / “From Peer I2” via Input PC1 and Input PC2, and outputs “To Peer O1” / “To Peer O2” via the Output PCs.)

  • Use Encoded Transform to read frames from Input PCs
  • Custom processing to produce output frames (discard duplicates, adjust metadata, append app-specific metadata, ...)
  • Write to Output PCs using Encoded Source

46 of 103

Original proposal

  • A few months ago we made a proposal patterned after RTCRtpEncodedSource
    • Similar to a single-side encoded "transform"
  • Updated proposal:
    • Keep the same model
    • Incorporate feedback from the WG and developers


47 of 103

Feedback from original proposal

  • Encoded Source allows more freedom than Encoded Transform
  • Easier to make mistakes
  • Less connected to internal control loops
  • Requires more signals (bandwidth, error handling)


48 of 103

Basic example

// main.js
const worker = new Worker('worker.js');

// Let relayPCs be the set of PCs used to relay frames.
for (let pc of relayPCs) {
  const [sender] = pc.getSenders();
  await sender.createEncodedSource(worker); // similar to replaceTrack()
}

// Let recvPc1, recvPc2 be the receiving PCs.
recvPc1.ontrack = ({receiver}) =>
    receiver.transform = new RTCRtpScriptTransform(worker);
recvPc2.ontrack = ({receiver}) =>
    receiver.transform = new RTCRtpScriptTransform(worker);

49 of 103

Basic example

// worker.js
let sourceWriters = [];

onrtcsenderencodedsource = ({source: {writable}}) => {
  sourceWriters.push(writable.getWriter());
};

onrtctransform = async ({transformer: {readable, writable, options}}) => {
  function transform(frame, controller) {
    if (shouldForward(frame)) { // app-defined (e.g., drop duplicates)
      for (let writer of sourceWriters)
        writer.write(getOutputFrame(frame)); // app-defined (e.g., adjust metadata)
    }
    controller.enqueue(frame); // put original frame back in receiver
  }
  await readable.pipeThrough(new TransformStream({transform})).pipeTo(writable);
};


50 of 103

Errors and signals - developer feedback

  • Keyframe requests
  • Bandwidth information
  • Synchronous errors for incorrect frames
    • Timestamps in the past
    • Decreasing frame ID
    • Handled by failing write
  • Other signals
    • Frames dropped before sending (possibly due to lack of bandwidth)
    • Expected queue time before sending (in pacer)


51 of 103

Signals - Key Frame requests

  • Event for Keyframe requests
    • Similar to the keyframerequest event in Encoded Transform
  • No need for a generateKeyFrame method (present in Encoded Transform)
    • The application would explicitly write the key frame to the source's writable

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSource {
  // Accepts RTCRtpEncoded{Video|Audio}Frame, rejects on incorrect frames
  readonly attribute WritableStream writable;
  attribute EventHandler onkeyframerequest;
};


52 of 103

Bandwidth signals

  • Based on Harald's proposal for congestion control
  • Have a field with bandwidth information:
    • Bitrate recommended for the media from this source
    • Available outgoing bitrate for the ICE candidate supporting the transport
      • Already exposed via stats
  • Have an event fire every time there is a significant change in bandwidth information


53 of 103

Bandwidth and other signals

[Exposed=DedicatedWorker]
interface BandwidthInfo {
  readonly attribute long allocatedBitrate; // bits per second
  readonly attribute long availableOutgoingBitrate;
};

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSource {
  ...
  readonly attribute BandwidthInfo bandwidthInfo;
  attribute EventHandler onbandwidthestimate;
  readonly attribute unsigned long long droppedFrames;
  readonly attribute double expectedSendQueueTime; // milliseconds
};


54 of 103

Example

// worker.js
async function maybeRelayFrame(frame, writer, bandwidthInfo) {
  // Append extra redundancy data to the payload if there is enough bandwidth
  if (bandwidthInfo.allocatedBitrate > kMinBitrateForExtraRedundancyData) {
    appendExtraData(frame);
  }
  await writer.write(frame);
}


55 of 103

API Shape

partial interface RTCRtpSender {
  Promise<undefined> createEncodedSource(
      Worker worker, optional any options, optional sequence<object> transfer);
};

partial interface DedicatedWorkerGlobalScope {
  attribute EventHandler onrtcsenderencodedsource;
};

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSourceEvent : Event {
  readonly attribute RTCRtpSenderEncodedSource encodedSource;
};


56 of 103

API Shape

interface BandwidthInfo {
  readonly attribute long allocatedBitrate; // bits per second
  readonly attribute long availableOutgoingBitrate;
};

[Exposed=DedicatedWorker]
interface RTCRtpSenderEncodedSource {
  readonly attribute WritableStream writable; // Accepts RTCRtpEncoded{Video|Audio}Frame
  attribute EventHandler onkeyframerequest;
  readonly attribute BandwidthInfo bandwidthInfo;
  attribute EventHandler onbandwidthestimate;
  readonly attribute any options;
};


57 of 103

Pros and cons

  • Similar pattern as encoded transform.
    • Proven in production
    • Easy to use and understand
  • Good match for SFU-like operations that are frame centric:
    • Zero-timeout, glitch-free forwarding of frames from redundant paths
    • Drop/adjust frames in response to bandwidth issues
    • No re-encoding
  • Requires waiting for a full frame, which introduces extra latency compared with a packet-based API
  • Future: RTCRtpReceiverEncodedSource?
    • Fan-in for the receiver


58 of 103

Discussion (End Time: 11:20)


59 of 103

Timing Model I (Bernard, Markus & Youenn)

Start Time: 11:20 AM

End Time: 11:40 AM


60 of 103

For Discussion Today

  • Mediacapture-transform
    • Issue 87: What is the timestamp value of the VideoFrame/AudioData from a remote track? (Bernard/Markus)
    • Issue 96: What is the impact of timestamp for video frames enqueued in VideoTrackGenerator? (Youennf)
    • Issue 80: Expectations/Requirements for VideoFrame and AudioData timestamps (Bernard/Markus)
    • Issue 86: Playback and sync of tracks created by VideoTrackGenerator (Bernard/Markus)


61 of 103

Issue 87: What is the timestamp value of the VideoFrame/AudioData from a remote track? (Bernard/Markus)

  • VideoFrame and AudioData Metadata both include a timestamp attribute.
    • WebCodecs defines these attributes as a “presentation timestamp”
      • encoder/decoder behavior not specified for AudioData
    • RVFC says captureTimestamp is not present for remote tracks
  • Sample Code sheds doubt on these statements
    • Encode a VideoFrame and serialize it with timestamp over the wire.
    • On the receiver, deserialize and decode the EncodedVideoChunk with timestamp.
    • Use VideoTrackGenerator to create a MST from the stream of VideoFrames.
    • Call RVFC. Observation:
      • captureTimestamp is set.
      • How? Suggests that timestamp is in fact a capture timestamp!
      • In the WebRTC context, how does this relate to rtpTimestamp?
  • Does the definition of timestamp need to be changed?
    • How does it differ from RVFC captureTime or rtpTimestamp?



65 of 103

Issue 96: What is the impact of timestamp for video frames enqueued in VideoTrackGenerator? (Youennf)

  • Proposed model
    • VideoTrackGenerator does not use nor modify timestamps
      • In particular, VideoTrackGenerator does not buffer video frames
    • Each track’s source defines how VideoFrame objects are created
      • Including computation of the VideoFrame’s timestamp
    • Each track’s sink defines how to use VideoFrame’s timestamps
  • VideoTrackGenerator spec clarification
    • Define “send clone to track” as immediately sending the clone to each of the track’s sinks
      • Probably in mediacapture-main spec

66 of 103

Issue 96: What is the impact of timestamp for video frames enqueued in VideoTrackGenerator? (Youennf)

  • Video track’s sources
    • Capture track: timestamp = capture time, consistent with RVFC
    • WebRTC track: timestamp, receiveTime & rtpTimestamp, consistent with RVFC
    • Canvas track: timestamp = time at which the canvas snapshot is made
  • Video track’s sinks
    • HTMLMediaElement, RTCRtpSender, MediaRecorder
    • By default, timestamp is not used
      • The time at which the frame is submitted is used instead.

67 of 103

Issue 80: Expectations/Requirements for VideoFrame and AudioData timestamps (Bernard)

  • Filed by Chris Cunningham (editor of WebCodecs!) on February 9, 2022.
    • “Is it valid to append multiple VideoFrames or AudioData object with the same timestamp (e.g. timestamp=0) to a MediaStreamTrack? If so what is the behavior? Does the spec describe this?”
  • If VideoTrackGenerator does not use or modify timestamps:
    • Does it pass VideoFrames with duplicate timestamp values on to the MediaStreamTrack?
    • What happens after that (e.g. in HTMLVideoElement)?
      • If timestamp is not used, do all the submitted VideoFrames render?


68 of 103

Issue 86: Playback and sync of tracks created by VideoTrackGenerator (Bernard/Markus)

  • Neither Mediacapture-transform nor Mediacapture-main discuss playout points.
  • If Audio and Video MSTs are combined in a MediaStream, what is supposed to happen?
    • Media Capture & Streams, Section 4.1:

69 of 103

Issue 86: Playback and sync of tracks created by VideoTrackGenerator (Cont’d)

  • How are the playout points determined?
    • Is timestamp used (e.g. as the capture time)?
      • Are all MSTs in a MediaStream assumed to use the same clock?
      • Is the RVFC attribute receiveTime used?
        • Can be used to calculate sender/receiver offset, jitter, and audio/video playout points.
          • Playout points adjusted to achieve sync.
    • Proposal:
      • PR (to mediacapture-?) indicating the role of timestamp and receiveTime in “lip sync”.
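As one example of what receiveTime enables: a hedged sketch of the RFC 3550 §6.4.1 interarrival-jitter estimator, assuming all timestamps have already been converted to the same millisecond clock (function name illustrative):

```javascript
// RFC 3550 interarrival jitter: J += (|D| - J) / 16, where D is the
// change in relative transit time between consecutive packets/frames.
function makeJitterEstimator() {
  let prev = null;
  let jitter = 0;
  return (receiveTimeMs, senderTimestampMs) => {
    if (prev !== null) {
      const d = Math.abs((receiveTimeMs - prev.receiveTimeMs) -
                         (senderTimestampMs - prev.senderTimestampMs));
      jitter += (d - jitter) / 16;
    }
    prev = { receiveTimeMs, senderTimestampMs };
    return jitter;
  };
}
```

A receiver could feed this with (receiveTime, timestamp) pairs per frame and size its playout delay from the smoothed jitter.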

70 of 103

Discussion (End Time: 11:40)


71 of 103

RtpTransport (Peter Thatcher)

Start Time: 11:40 AM

End Time: 12:00 PM


72 of 103

Reminder about use cases

Control/customization of:

  • Payloads and Header Extensions (Codecs, Data, Metadata)
  • Packetization (WebCodecs, WASM)
  • Reliability (FEC, NACK/RTX)
  • Jitter Buffer
  • Congestion Control (bandwidth estimation, pacing, probing)
  • Rate Control (bitrate allocation, encoder rates)
  • Control/Feedback messages

73 of 103

Example Use Cases

  • Send arbitrary data within the same congestion control as audio/video
  • Do processing (face tracking) on video and attach metadata
  • WebCodecs-based support for:
    • AAC over RTP
    • Control of hardware/software failover
    • Per-frame QP rate control
    • Long-term references (LTRs)
    • Spatial scalability with Layer Refresh (LRR)
  • Implement an audio codec using WASM
  • Implement FEC designed for high-loss scenarios
  • Implement a jitter buffer more suitable to streaming applications
  • Forward RTP packets from one PeerConnection to another
  • Implement a congestion control algorithm using L4S signals
  • Implement RTCP messages like LRR, RPSI, SLI, RTCP-XR

74 of 103

Status Update

  • A while ago
    • Agreement in the Working Group to add "piecemeal" (incremental) low-level RTP/RTCP functionality
  • Progress made since then

75 of 103

Things That Have Been Figured Out

  • Transferability/Workers
    • RtpTransport requires DedicatedWorker
    • RtpSendStream/RtpReceiveStream are Transferable
  • Processing lots of packets (optional batch processing)
    • Events carry no payloads
    • You read out the pending data with a “readFoo(long maxNumber)” pattern
  • Custom BWE/Pacing/Probing
    • JS told what RTP packets would have been sent
    • JS can send packets with a particular send time
    • JS told what RTP packets have been sent and what RTCP feedback has been received
    • JS told if a packet is dropped because of "overuse" (according to a lenient congestion controller)
  • BYOB
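The "readFoo(maxNumber)" batch pattern separates the notification ("data is available") from data retrieval. Its core bookkeeping can be sketched independently of the real API (all names here are hypothetical):

```javascript
// The event handler would call readPackets(maxNumber) repeatedly,
// draining the queue in bounded batches until it returns empty.
function makeBatchReader(queue) {
  return function readPackets(maxNumber) {
    return queue.splice(0, maxNumber); // removes and returns up to maxNumber items
  };
}
```

In the proposed API, the loop would live in a worker event handler, e.g. reading batches of 64 packets until an empty array comes back.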

76 of 103

Things In Progress

  • Custom NACK/RTX (Explainer coming)
  • Simulcast (per-MID vs per-RID objects in the API)
  • Going "SDP-less" (Explainer coming)

77 of 103

Conclusion

  • We're making good progress
  • Still much work to do
  • There's a breakout session tomorrow! (10-11am in California A)

78 of 103

Discussion (End Time: 12:00)

78

79 of 103

Corruption Stats + Encoder Complexity (Erik Språng)

Start Time: 12:00 PM

End Time: 12:20 PM

79

80 of 103

Stats issue 787: Corruption Likelihood Metric

The purpose is to provide a metric that indicates the estimated probability that an inbound video stream is experiencing corruption.

We’re targeting outright bugs that cause visual artifacts that are otherwise not visible in any existing stats; it is not intended as a general quality metric. The typical use case is finding problems early and being able to root-cause them to e.g. browser versions, hardware setups, or configuration/experiment rollouts, without having to rely on user feedback reports.

80

double totalCorruptionProbability;

double totalSquaredCorruptionProbability;

unsigned long long corruptionMeasurements;
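
These accumulator fields let an application derive the average corruption probability and its spread without per-frame polling; a sketch of the arithmetic (field names as proposed in issue 787, the helper itself is hypothetical):

```javascript
// Derive mean and standard deviation of the corruption probability
// from the three proposed accumulator stats fields.
function corruptionSummary(stats) {
  const n = stats.corruptionMeasurements;
  if (n === 0) return null; // no measurements yet
  const mean = stats.totalCorruptionProbability / n;
  // E[x^2] - E[x]^2, clamped against floating-point underflow.
  const variance =
    stats.totalSquaredCorruptionProbability / n - mean * mean;
  return { mean, stddev: Math.sqrt(Math.max(variance, 0)) };
}

// Example with illustrative numbers: 10 measurements summing to 1.5.
const summary = corruptionSummary({
  totalCorruptionProbability: 1.5,
  totalSquaredCorruptionProbability: 0.55,
  corruptionMeasurements: 10,
});
// mean = 0.15, variance = 0.055 - 0.0225 = 0.0325
```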

81 of 103

Stats issue 787: Corruption Likelihood Metric

Example implementation: http://www.webrtc.org/experiments/rtp-hdrext/corruption-detection

RTP header extension used as side-channel, transmitting randomly selected image samples for validation

81

82 of 103

Stats issue 787: Corruption Likelihood Metric

82

(Diagram: RTP packet carrying the corruption-detection header extension)

With the encoded images, piggy-back a few (13 with one-byte extensions) randomly selected samples as raw values.

83 of 103

Stats issue 787: Corruption Likelihood Metric

83

(Diagram: RTP packet carrying the corruption-detection header extension)

Compare to the raw decoded sample values on the receive side.

84 of 103

Stats issue 787: Corruption Likelihood Metric

84

Large QP: average over a large area

Low QP: average over a small area

The extension has fields to indicate filter size and allowed error thresholds, so that expected distortions from lossy compression can be suppressed.
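
The receive-side check this enables can be sketched as follows (the helper name and mean filter are illustrative; the actual filtering and thresholds are defined by the extension): the decoded neighborhood is averaged per the signaled filter size, and only differences beyond the allowed threshold count as corruption.

```javascript
// Compare one transmitted reference sample against the filtered
// (here: mean-filtered) decoded neighborhood. Differences within the
// allowed threshold are treated as ordinary compression distortion.
function sampleCorrupted(referenceSample, decodedNeighborhood, threshold) {
  const filtered =
    decodedNeighborhood.reduce((sum, v) => sum + v, 0) /
    decodedNeighborhood.length;
  return Math.abs(referenceSample - filtered) > threshold;
}
```

Widening the filter and threshold at large QP is what keeps expected lossy-compression error from being flagged as corruption.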

85 of 103

Stats issue 787: Corruption Likelihood Metric

85

The stats value is not intended to be tightly coupled to this implementation.

Described as a probability of corruption in the range [0.0, 1.0], generic enough to be used with other implementations.

Future iterations could include:

  • Other forms of more precise side-channel information.
  • Receive-side only implementations
    • Using e.g. natural image statistics or ML models to detect issues

86 of 103

Stats issue 787: Corruption Likelihood Metric

86

Relevant links:

87 of 103

Issue 191: Add API to control encode complexity

87

Add encodeComplexityMode to RTCRtpEncodingParameters:

enum RTCEncodeComplexityMode {
  "low",
  "normal",
  "high"
};

partial dictionary RTCRtpEncodingParameters {
  RTCEncodeComplexityMode encodeComplexityMode = "normal";
};

Specifies the encoding complexity mode relative to "normal" mode:

  • "low" mode results in lower device resource usage and worse compression efficiency
  • "high" mode results in higher device resource usage and better compression efficiency

88 of 103

Issue 191: Add API to control encode complexity

88

Intended use cases:

  • Allow an application to balance bandwidth/CPU usage
    • Makes it possible to reduce bandwidth usage without regressing quality, at the expense of higher CPU usage. This choice can depend on both device type and organizational requirements.

  • Better ability to adapt to type of device
    • Reduce CPU usage for devices with known thermal-throttling issues
    • Increase quality for non-constrained devices (e.g. plugged in meeting room devices)
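
One way the adaptation could look in an application (a sketch: the device-state inputs and the policy function are hypothetical; only encodeComplexityMode itself comes from the proposal):

```javascript
// Map device conditions to the proposed encodeComplexityMode values.
// The input flags are illustrative; an app would source them from its
// own device knowledge (model lists, power state, deployment config).
function pickComplexityMode({ onBattery, thermalThrottled, meetingRoomDevice }) {
  if (thermalThrottled) return 'low';                  // known throttling issues
  if (meetingRoomDevice && !onBattery) return 'high';  // unconstrained device
  return 'normal';
}

// The result would then be applied per encoding, e.g.:
//   const params = sender.getParameters();
//   params.encodings[0].encodeComplexityMode = mode;
//   await sender.setParameters(params);
```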

89 of 103

Discussion (End Time: 12:20)

89

90 of 103

Wrapup and Next Steps

Start Time: 12:20

End Time: 12:30

90

91 of 103

Next Steps

  • Content goes here

91

92 of 103

Spillover

92

93 of 103

Use Case 2: SFU “Lip Sync”

  • SFU relays RTP streams from multiple participants
  • Audio: RTP streams mixed (or forwarded?) on the SFU.
  • Video: RTP streams forwarded by SFU, rendered on client
  • SFU terminates/originates RTCP
  • How does the client sync audio & video from each participant?
    • RTCP SR NTP timestamp reflects SFU wallclock
      • Case a: rtpTimestamp forwarded unmodified from capturer
      • Case b: rtpTimestamp rewritten by SFU
    • Abs-capture-timestamp RTP header extension provides:
      • Abs-capture-timestamp: NTP timestamp of the first byte captured in the original sender’s wallclock.
        • Case a: Abs-capture-timestamp + original rtpTimestamp enables “lip sync” to sender wallclock
      • Estimated-clock-offset: estimated offset between the SFU wallclock and the original sender/capturer wallclock.
        • Case b: Abs-capture-timestamp + estimated-clock-offset + SFU rtpTimestamp enables “lip sync” to sender wallclock
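
Once both streams' capture times are mapped onto a common wallclock (via either case above), client-side lip sync reduces to delaying the faster stream by the latency difference; a sketch with illustrative field names and units (ms):

```javascript
// Given one audio and one video frame from the same participant, each
// stamped with a capture time on a common wallclock, compute the extra
// playout delay to apply to the earlier-arriving stream.
function lipSyncDelayMs(audio, video) {
  // Per-stream end-to-end latency observed at the client.
  const audioLatency = audio.receiveTimeMs - audio.captureTimeMs;
  const videoLatency = video.receiveTimeMs - video.captureTimeMs;
  // Delay the faster stream so both play out at the same point.
  return {
    audioDelayMs: Math.max(0, videoLatency - audioLatency),
    videoDelayMs: Math.max(0, audioLatency - videoLatency),
  };
}
```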

93

94 of 103

94

95 of 103

95

96 of 103

Use Case 3: SFU Audio Sync

  • Same topology as Use Case 2, but goal is to sync audio from multiple participants.
    • Example: Ukulele Lockdown
  • How does the client sync audio from multiple participants?
    • RTCP SR NTP timestamp reflects SFU wallclock
      • Case a: rtpTimestamp forwarded unmodified from capturer
      • Case b: rtpTimestamp rewritten by SFU
    • Abs-capture-timestamp RTP header extension provides:
      • Abs-capture-timestamp: NTP timestamp of the first byte captured in the original sender’s wallclock.
      • Estimated-clock-offset: estimated offset between the SFU wallclock and the original sender/capturer wallclock.
        • Case a: Capturer rtpTimestamp + abs-capture-timestamp + estimated-clock-offset aligns streams on the SFU wallclock
        • Case b: SFU rtpTimestamp + estimated-clock-offset aligns streams on SFU wallclock
      • Stream playout delay adjusted based on max audio playout point (lowest delay)
      • Stream playout delay adjusted based on max audio/video playout point (max delay)
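
With every participant's capture time mapped onto the SFU wallclock, the playout-point adjustment above amounts to delaying each stream to match the slowest one; a sketch with illustrative units (ms):

```javascript
// Extra per-stream delay so all streams share the playout point of the
// highest-latency stream (the max-playout-point policy above).
function alignmentDelays(latenciesMs) {
  const slowest = Math.max(...latenciesMs);
  return latenciesMs.map(l => slowest - l);
}
// alignmentDelays([120, 80, 95]) -> [0, 40, 25]
```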

96

97 of 103

Timing Use Case II: BYOC

  • AudioData needs to be rendered every ptime ms
  • Bring your own Audio Codec (WASM)
  • WASM audio encoder takes AudioData as input, produces EncodedAudioChunks as output.
  • WASM decoder takes EncodedAudioChunks as input, produces AudioData as output.
    • Concealment (and internal FEC/RED) triggered by feeding EncodedAudioChunks with no AudioData to decoder.
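
A sketch of the render-side pull loop (the decoder interface here is hypothetical; only the feed-nothing-to-trigger-concealment behavior is from the slide): every ptime ms the renderer asks for audio, and a gap in the chunk queue becomes a concealment request instead of silence.

```javascript
// One call per ptime tick. A missing EncodedAudioChunk is turned into
// a concealment request (PLC / in-band FEC / RED recovery) rather than
// leaving a hole in the rendered audio.
function nextAudio(decoder, chunkQueue) {
  const chunk = chunkQueue.shift();
  return chunk !== undefined
    ? decoder.decode(chunk)   // normal decode path
    : decoder.conceal();      // chunk lost or late: conceal
}

// Usage with a stub decoder (illustrative):
const stubDecoder = {
  decode: chunk => 'decoded:' + chunk,
  conceal: () => 'plc',
};
const queue = ['chunk-1'];
const first = nextAudio(stubDecoder, queue);  // normal path
const second = nextAudio(stubDecoder, queue); // queue empty: concealment
```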

97

98 of 103

98

99 of 103

99

100 of 103

100

101 of 103

101

102 of 103

102

103 of 103

Thank you

Special thanks to:

WG Participants, Editors & Chairs

103