1 of 37

W3C WebRTC

WG Meeting

July 22, 2021

10:00 AM - Noon Pacific Time

Chairs: Bernard Aboba

Harald Alvestrand

Jan-Ivar Bruaroey

2 of 37

W3C WG IPR Policy

3 of 37

Welcome!

  • Welcome to the July 2021 interim meeting of the W3C WebRTC WG, at which we will cover:
    • MediaStreamTrack transfer
    • Mediacapture-transform alternative proposal
    • Screen Capture

4 of 37

About this Virtual Meeting

5 of 37

Understanding Document Status

  • Hosting within the W3C repo does not imply adoption by the WG.
    • WG adoption requires a Call for Adoption (CfA) on the mailing list.
  • Editor’s drafts do not represent WG consensus.
    • WG drafts do imply consensus, once they’re confirmed by a Call for Consensus (CfC) on the mailing list.

6 of 37

W3C Code of Conduct

  • This meeting operates under the W3C Code of Ethics and Professional Conduct
  • We're all passionate about improving WebRTC and the Web, but let's all keep the conversations cordial and professional

7 of 37

Virtual Interim Meeting Tips

This session is being recorded

  • Type +q and -q in the Google Meet chat to get into and out of the speaker queue.
  • Please use headphones when speaking to avoid echo.
  • Please wait for microphone access to be granted before speaking.
  • Please state your full name before speaking.
  • Poll mechanism may be used to gauge the “sense of the room”.

8 of 37

Issues for Discussion Today

  • 10:10 AM - 10:25 AM MediaStreamTrack Transfer
    • Slides by Youenn (10:10 - 10:15)
    • Discussion (10:15 - 10:25 AM)
  • 10:25 AM - 11:00 AM Mediacapture-transform Callback Proposal (Youenn)
    • Slides by Youenn (10:25 - 10:45)
    • Discussion (10:45 - 11:00)
  • 11:00 AM - 11:30 AM Screen Capture (Jan-Ivar)
    • Slides by Jan-Ivar (11:00 - 11:15)
    • Discussion (11:15 - 11:30)
  • 11:30 AM - 12:00 PM Wrap-up and Next Steps

Time control:

  • A warning will be given 2 minutes before time is up.
  • Once time has elapsed we will move on to the next item.

9 of 37

#29 MediaStreamTrack Transfer: Summary

  • 05/21 interim meeting: 'transferable' option presented
    • 'transferable' option quote from 05/21 interim meeting
      • Default operation is “move” - original track is destroyed; if needed, must use “.clone()”
      • Lifetime strongly tied to original document
    • Conservative lifetime management
      • Works with all existing MediaStreamTrack sources
      • Works with existing security and privacy infrastructure
      • Works with all implementations
  • How to complete specification?
    • Core algorithm already specified in mediacapture-extensions
      • PR 21 merged
      • Some pieces have not landed yet.
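
The “move” default follows the pattern of existing transferable objects. A minimal sketch of that pattern, using `ArrayBuffer` and `structuredClone` purely as a stand-in (an assumption for illustration; MediaStreamTrack transfer itself is not exercised here): once transferred, the original is unusable, so anything still needed on the sending side has to be copied first, which is the role `.clone()` would play for a track.

```javascript
// Analogy only: ArrayBuffer stands in for a transferable MediaStreamTrack.
const original = new ArrayBuffer(8);

// Counterpart of track.clone(): take a copy before transferring.
const kept = original.slice(0);

// Transfer ("move") the original, as postMessage(msg, [transferable]) would.
const moved = structuredClone(original, { transfer: [original] });

console.log(original.byteLength); // the original is detached after the move
console.log(kept.byteLength);     // the copy is unaffected
console.log(moved.byteLength);    // the receiver owns the data
```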

10 of 37

#29 MediaStreamTrack Transfer: editorial finalization

  • Step 1: Clarify that, by default, all sources are tied to their creation context
    • Existing User Agent behavior
    • Leave the door open to new types of source with a different lifetime
    • PR#805 in mediacapture-main
  • Step 2: Clarify MediaStreamTrack transfer behavior derived from step 1
    • If original document goes away, transferred track gets ended
    • Leave the door open to new types of source with a different lifetime
    • PR#30 in mediacapture-extensions

11 of 37

Discussion (10:15 to 10:25)

  • How to determine consensus?
    • Slides at a meeting do not imply consensus.
    • Can issue a CfC on individual issues, or for promotion of MediaCapture-Extensions to WG draft.
  • Questions?
  • Opinions?

12 of 37

Mediacapture-transform Alternative Proposal

(10:25 - 10:45)

13 of 37

#23 Towards more JS in MediaStreamTrack

  • Goal 1: Enable safe & efficient access to MediaStreamTrack VideoFrame(s)
    • Head detection, object tracking…
  • Goal 2: Enable transforming a native video MediaStreamTrack
    • CameraTrack → background blur → TransformedTrack
    • Use TransformedTrack as if it was CameraTrack
      • Including CameraTrack behavior
        • Muted, enabled, applyConstraints…
  • Proposed API is illustrative

14 of 37

#23 Why focus only on video?

  • Audio processing best practice != ReadableStream<AudioBuffer>
    • Store audio data in a ring buffer (SharedArrayBuffer)
    • I/O in worklet, audio processing in worker
      • The same worker can do video processing for synchronized A+V applications
    • Sources
      • https://developers.google.com/web/updates/2018/06/audio-worklet-design-pattern
        • "It uses the Audio Worklet as a simple "audio sink" and does everything in the Worker"
      • https://github.com/padenot/ringbuf.js/
        • "Sending audio from a non-real-time thread to a real-time thread is sometimes useful"
        • "The opposite is very useful"
  • Uncertainty in exposing a built-in audio/video generic mechanism
    • Chances are high it will be suboptimal compared to app-specific
    • This proposal supports app-specific audio/video mechanisms
  • We can add a generic mechanism based on this proposal later on
    • If we find out we need it
    • In the meantime, worklet fills the gap as it can shim MSTP (& more)
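
The ring-buffer pattern referenced above can be sketched as follows. This is a minimal single-producer/single-consumer illustration, not the ringbuf.js API: the `AudioRing` class and its method names are hypothetical, and both sides run in one thread here, whereas in practice the two SharedArrayBuffers would be posted to the worklet and the worker.

```javascript
// Minimal single-producer/single-consumer ring buffer over SharedArrayBuffer.
// Illustrative sketch only; AudioRing is a hypothetical name.
class AudioRing {
  constructor(capacity) {
    // One slot stays empty to distinguish "full" from "empty".
    this.capacity = capacity + 1;
    this.indices = new Uint32Array(new SharedArrayBuffer(8)); // [read, write]
    this.samples = new Float32Array(new SharedArrayBuffer(4 * this.capacity));
  }
  // Producer side (e.g. the worklet). Returns the number of samples written.
  push(input) {
    const read = Atomics.load(this.indices, 0);
    let write = Atomics.load(this.indices, 1);
    const free = (read - write - 1 + this.capacity) % this.capacity;
    const n = Math.min(free, input.length);
    for (let i = 0; i < n; i++) {
      this.samples[write] = input[i];
      write = (write + 1) % this.capacity;
    }
    Atomics.store(this.indices, 1, write); // publish the new write index
    return n;
  }
  // Consumer side (e.g. the worker). Returns the number of samples read.
  pop(output) {
    let read = Atomics.load(this.indices, 0);
    const write = Atomics.load(this.indices, 1);
    const available = (write - read + this.capacity) % this.capacity;
    const n = Math.min(available, output.length);
    for (let i = 0; i < n; i++) {
      output[i] = this.samples[read];
      read = (read + 1) % this.capacity;
    }
    Atomics.store(this.indices, 0, read); // publish the new read index
    return n;
  }
}

const ring = new AudioRing(4);
// Five samples offered, but only four fit; the rest is left to the caller.
const written = ring.push(Float32Array.of(0.1, 0.2, 0.3, 0.4, 0.5));
const out = new Float32Array(4);
const readCount = ring.pop(out);
```

Neither side ever blocks: a full buffer makes `push` write fewer samples, and an empty one makes `pop` return 0, which is what makes the pattern usable from a real-time audio thread.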

15 of 37

#23 Why not use WhatWG streams?

  • MediaStreamTrack is an augmented ReadableStream<VideoFrame>
    • Manages more than a queue of VideoFrame(s)
      • muted/unmuted, enabled, constraints, capabilities, settings...
    • BUT MediaStreamTrack lacks a reader and a JS constructor
    • Let’s add this missing API!
      • It is best to add direct API instead of lossy converters
        • App-specific lossy converters can be written in JS
  • Avoid WhatWG Streams edge cases
    • Identified by WebCodecs team for WebCodecs API
    • Identified by WebRTC WG members for MediaStreamTrack
      • buffering, teeing, 'optimized-but-underspecified' transferring
  • Stay consistent with tightly related APIs
    • WebCodecs and Web Audio

16 of 37

#23 Goal 1: Reading a video MediaStreamTrack (1/3)

  • Do we want an equivalent to ReadableStream Reader?
    • Not really, reader.closed ↔ track.onended, cancel/stop
    • We only need reader.read()
      • Or an HTMLVideoElement.requestVideoFrameCallback equivalent
  • Proposal: add 1 new callback to get video frames
    • A la requestVideoFrameCallback
    • Keep using existing MediaStreamTrack API for state management

async function transform(frame) {
  ...
}
// start processing: assign the function itself, do not call it
track.processVideoFrame = transform;
// finish processing
track.processVideoFrame = null;

partial interface MediaStreamTrack {
  [Exposed=Worker] attribute VideoFrameCallback processVideoFrame;
};
callback VideoFrameCallback = Promise<undefined>(VideoFrame);

17 of 37

#23 Goal 1: Reading a video MediaStreamTrack (2/3)

  • Callback returns a promise. Once promise is settled:
    • User Agent can call the callback again
      • Enable backpressure
    • VideoFrame is closed automatically
      • Control lifetime of exposed VideoFrame
  • Consequences
    • If promise takes too much time to resolve, frames are dropped
    • If application needs to store a frame, it has to clone it explicitly
  • Easy synchronization with MediaStreamTrack events
    • Muted event → callback no longer called until unmuted event fires
    • Ended event → callback no longer called

async function transform(f) {
  // Use ‘f’ only while inside this function
  ...
  // ‘f’ will be closed when exiting
}

track.processVideoFrame = (f) => {
  // transform returns a promise
  return transform(f);
};
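
The delivery rules on this slide (one call at a time, frames dropped while the promise is pending, frame closed once it settles) can be modelled in a few lines. This is a hypothetical model, not a browser API: `MockTrack`, `deliver` and `makeFrame` are names invented for the sketch, and the timings are arbitrary.

```javascript
// Hypothetical model of the proposed delivery rules; not a real browser API.
class MockTrack {
  constructor() {
    this.processVideoFrame = null; // the proposed callback attribute
    this.busy = false;             // true while the last promise is pending
    this.dropped = 0;
  }
  // Called by the "User Agent" each time the source produces a frame.
  deliver(frame) {
    if (!this.processVideoFrame || this.busy) {
      this.dropped++;              // backpressure: the frame is dropped
      frame.close();
      return;
    }
    this.busy = true;
    Promise.resolve()
      .then(() => this.processVideoFrame(frame))
      .catch(() => {})             // a rejection also settles the call
      .finally(() => {
        frame.close();             // the frame is closed automatically
        this.busy = false;         // the callback may be called again
      });
  }
}

// Stand-in for VideoFrame; it only records whether close() was called.
const makeFrame = (id) => ({ id, closed: false, close() { this.closed = true; } });

const track = new MockTrack();
const seen = [];
track.processVideoFrame = async (f) => {
  seen.push(f.id);
  await new Promise((r) => setTimeout(r, 20)); // slow processing
};

const frames = [1, 2, 3].map(makeFrame);
track.deliver(frames[0]); // processed
track.deliver(frames[1]); // dropped: the previous promise is still pending
const done = new Promise((r) => setTimeout(r, 50)).then(() => {
  track.deliver(frames[2]); // processed: the previous promise has settled
  return new Promise((r) => setTimeout(r, 50));
});
```

An application that wants to keep a frame past the end of its callback would have to clone it inside the callback, since the model closes the original as soon as the promise settles.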

18 of 37

#23 Goal 1: Reading a video MediaStreamTrack (3/3)

  • Comparison between callback vs. stream based approaches

// Callback-based approach
async function transform(frame) {
  webCodecEncoder.encode(frame);
}

track.processVideoFrame = transform;

// Stream-based approach
async function transform(frame) {
  webCodecEncoder.encode(frame);
}

const processor = new MediaStreamTrackProcessor({ track });
const reader = processor.readable.getReader();
let chunk = await reader.read();
while (!chunk.done) {
  await transform(chunk.value);
  // Probably need try/catch to ensure the frame always gets closed
  chunk.value.close();
  chunk = await reader.read();
}
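
The comment about try/catch in the stream-based loop can be made concrete with `try`/`finally`. A minimal sketch, assuming plain objects as stand-ins for VideoFrame and an ordinary `ReadableStream` in place of `processor.readable`; the failing `transform` is contrived to exercise the error path.

```javascript
// Stand-in for VideoFrame; only close() matters for this sketch.
const makeFrame = (id) => ({ id, closed: false, close() { this.closed = true; } });
const frames = [makeFrame(1), makeFrame(2)];

// Stand-in for processor.readable.
const readable = new ReadableStream({
  start(controller) {
    frames.forEach((f) => controller.enqueue(f));
    controller.close();
  },
});

// Hypothetical transform that fails on the second frame.
async function transform(frame) {
  if (frame.id === 2) throw new Error("transform failed");
}

async function pump() {
  const reader = readable.getReader();
  let failures = 0;
  for (;;) {
    const { value: frame, done } = await reader.read();
    if (done) break;
    try {
      await transform(frame);
    } catch (e) {
      failures++;      // keep reading even if one frame fails
    } finally {
      frame.close();   // always release the frame
    }
  }
  return failures;
}

const result = pump();
```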

19 of 37

#23 Goal 2: MediaStreamTrack JS Source - level 1

  • Construct a MediaStreamTrack from a JS source
    • Reuse ReadableStream JS constructor model

const stream = await navigator.mediaDevices.getUserMedia({video: true});
const originalTrack = stream.getVideoTracks()[0];

// originalTrack --> transform --> transformedTrack
const source = {
  start: (controller) => {
    originalTrack.processVideoFrame = async (frame) => {
      // Transform frame from originalTrack
      const newFrame = await transform(frame);
      // Enqueue newFrame in transformedTrack pipeline
      controller.enqueue(newFrame);
    };
    // Stop track when originalTrack is stopped
    originalTrack.onended = () => controller.stop();
  },
  // When track is stopped, stop frame transforming
  stop: () => originalTrack.processVideoFrame = null
};
const transformedTrack = MediaStreamTrack.createVideoTrack(source);

// Use transformedTrack instead of originalTrack
pc.addTrack(transformedTrack, stream);

partial interface MediaStreamTrack {
  [Exposed=Worker] static MediaStreamTrack
    createVideoTrack(VideoSource source);
};

dictionary VideoSource {
  // Called synchronously by createVideoTrack
  VideoSourceStartCallback start;
  // Called when track.stop is called
  VideoSourceStopCallback stop;
};

[Exposed=Worker]
interface VideoSourceController {
  undefined enqueue(VideoFrame frame);
  undefined stop();
};

callback VideoSourceStartCallback =
  any(VideoSourceController);
callback VideoSourceStopCallback = any();

20 of 37

#23 MediaStreamTrack & WhatWG streams

  • Integration with WhatWG streams is easy!
    • Just a few lines of tweakable code

// MediaStreamTrackGenerator equivalent

function createMyMediaStreamTrackGenerator() {
  const controllers = { };
  // Create a MediaStreamTrack object
  const generatedTrack = MediaStreamTrack.createVideoTrack({
    start: (c) => controllers.track = c,
    stop: () => controllers.stream.error()
  });
  // Create a WritableStream
  generatedTrack.writable = new WritableStream({
    start : (c) => controllers.stream = c,
    write : (f) => controllers.track.enqueue(f),
    close : () => controllers.track.stop(),
    abort : () => controllers.track.stop()
  });
  return generatedTrack;
}

// MediaStreamTrackProcessor equivalent

class MyMediaStreamTrackProcessor {
  constructor(track) {
    this.readable = new ReadableStream({
      start : (c) => {
        track.onended = () => c.close();
        const p = new Promise(r => this.firstFrame = r);
        track.processVideoFrame = (frame) => {
          c.enqueue(frame.clone());
          this.firstFrame();
          return new Promise(r => this.nextFrame = r);
        };
        return p;
      },
      pull : (c) => this.nextFrame(),
      // A ReadableStream underlying source is notified through cancel()
      cancel : () => track.processVideoFrame = null,
    });
  }
}
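
To try the generator bridge outside a browser that implements the proposal, the proposed method can be replaced by a small shim. A minimal sketch under stated assumptions: the free function `createVideoTrack` below is a hypothetical stand-in for `MediaStreamTrack.createVideoTrack`, the "track" is a plain object that just collects frames, and strings stand in for VideoFrame objects. Only `WritableStream` is a real platform API here.

```javascript
// Hypothetical shim of the proposed MediaStreamTrack.createVideoTrack().
function createVideoTrack(source) {
  const track = { frames: [], readyState: "live" };
  const controller = {
    enqueue: (frame) => {
      if (track.readyState === "live") track.frames.push(frame);
    },
    stop: () => { track.readyState = "ended"; },
  };
  track.stop = () => {
    if (track.readyState === "ended") return;
    track.readyState = "ended";
    if (source.stop) source.stop();
  };
  // start is called synchronously, as the IDL comment on the slide states.
  if (source.start) source.start(controller);
  return track;
}

// MediaStreamTrackGenerator-like bridge written against the shim.
function createGenerator() {
  const controllers = {};
  const track = createVideoTrack({
    start: (c) => { controllers.track = c; },
    stop: () => controllers.stream.error(new Error("track stopped")),
  });
  track.writable = new WritableStream({
    start: (c) => { controllers.stream = c; },
    write: (f) => controllers.track.enqueue(f),
    close: () => controllers.track.stop(),
    abort: () => controllers.track.stop(),
  });
  return track;
}

const generated = createGenerator();
const writer = generated.writable.getWriter();
const finished = (async () => {
  await writer.write("frame-1");
  await writer.write("frame-2");
  await writer.close(); // closing the stream ends the track
})();
```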

21 of 37

#23 Goal 2: MediaStreamTrack JS Source - level 1

  • What about directly passing a WritableStream to createVideoTrack?

  • Passing a WritableStream to createVideoTrack is not enough
    • Need to handle more than VideoFrame(s)
      • For instance muted/unmuted MediaStreamTrack state
        • Let's go to level 2!

MediaStreamTrack.createVideoTrack(VideoSource source)

vs.

MediaStreamTrack.createVideoTrack(WritableStream stream)

22 of 37

#23 Goal 2: MediaStreamTrack JS Source - level 2

  • Support for muted/unmuted
    • Use case: if original track gets muted, mute the transformed track

  • A single control place, natural to use
    • Controller enqueues frames
    • Controller can mute/unmute its track
    • Controller is muted → controller does not enqueue frames

const transformedTrack = MediaStreamTrack.createVideoTrack({
  start: (c) => {
    originalTrack.processVideoFrame = transform;
    // Handle camera track muted state
    // (MediaStreamTrack event handlers are onmute/onunmute).
    originalTrack.onmute = () => c.muted = true;
    originalTrack.onunmute = () => c.muted = false;
  }
});
...
transformedTrack.onmute = () => { ... };
pc.addTrack(transformedTrack, stream);

partial interface VideoSourceController {
  attribute boolean muted;
};

23 of 37

#23 Goal 2: MediaStreamTrack JS Source - level 3

  • Let’s start simple, but make sure we can easily extend the API
  • Support for track.enabled changing to true or to false
    • Use case: if web app disables transformed track, disable the original track
  • Support for capabilities, constraints and settings
    • Use case: if calling applyConstraints on the transformed track, apply the constraints to the original track

24 of 37

#23 MediaStreamTrack JS Source - the whole thing!

// video-call-with-a-dragon-head.html
async function getCameraTrackWithDragonHead()
{
  const stream = await navigator.mediaDevices.getUserMedia({video:true});
  const source = stream.getVideoTracks()[0];
  const worker = new Worker("dragon-head-worker.js");
  let resolve;
  const promise = new Promise(r => resolve = r);
  worker.onmessage = event => resolve(event.data.track);
  // Transfer (move) the camera track to the worker
  worker.postMessage({track:source}, [source]);
  return promise;
}

const track = await getCameraTrackWithDragonHead();

// Use track as if coming from getUserMedia, including:
// - Using clone, applyConstraints et al
// - Receiving mute/unmute events from source
// - Set enabled=true/false, propagating to source
...
track.onmute = () => { ... };

// dragon-head-worker.js
onmessage = (event) => {
  const track = event.data.track;
  let controller;
  const transformed = MediaStreamTrack.createVideoTrack({
    start: (c) => {
      controller = c;
      track.onended = () => c.stop();
      // transform then enqueue each camera frame
      track.processVideoFrame =
        async (f) => c.enqueue(await transformHead(f));
      // propagate camera muted state to track
      track.onmute = () => c.muted = true;
      track.onunmute = () => c.muted = false;
      // propagate camera settings to track
      c.capabilities = track.getCapabilities();
      c.constraints = track.getConstraints();
      c.settings = track.getSettings();
    },
    applyConstraints: async (constraints) => {
      // propagate applyConstraints to camera
      await track.applyConstraints(constraints);
      controller.constraints = track.getConstraints();
      controller.settings = track.getSettings();
    },
    // propagate enabled to camera
    enabledChanged: (value) => track.enabled = value,
    stop: () => track.processVideoFrame = null
  });
  self.postMessage({track:transformed}, [transformed]);
};

25 of 37

#23 Beyond Goal 2

  • Extend API to handle native sink backpressure signals
    • HTMLMediaElement, RTCPeerConnection, WebAudio
    • Maybe, we need to do further investigation
    • Easy to support should there be a need
  • Extend API to handle transforms
    • In particular native transforms like synthetic backgrounds
    • MediaStreamTrack → MediaStreamTrack
    • WASM/WebGPU (or even JS) dedicated transforms
      • Help authoring & use of good practices
        • Replace boilerplate code with safe & efficient code
      • Possibility for User Agent optimizations
        • Zero memory copy

26 of 37

#23 Tentative conclusion

  • Simple: 2 MediaStreamTrack methods + 1 new interface
    • Easy to learn & use, blends well with existing APIs
  • Safe & efficient
    • No buffering by default, implicit VideoFrame closing
    • Worker by default, at least initially until we gather more experience
  • Powerful
    • Reduced API set for most useful functionality
    • Full MediaStreamTrack shimming support
  • Flexible
    • Clear path towards native transforms, zero copy...
    • WhatWG stream bridge with a few lines of code

27 of 37

Discussion (10:45 - 11:00)

  • Questions?
  • Opinions?

28 of 37

Screen Capture (Jan-Ivar)

  • #182 - Recognize safer & better-integrated web presentations in getDisplayMedia
  • #158 - Add ability to crop a MediaStream obtained through the share-this-tab API

29 of 37

Screen-capture

#1 use case: “Screen sharing using WebRTC” = VC presentations

2021 = web presentations

#1 use case is unsafe!

We must fix this!

30 of 37

From presentation I gave @ W3C 2021 AC Meeting

“W3C WebRTC Working Group - April 2021 Update” youtu.be/watch?v=PQ5dGmqeuQI&t=419s

31 of 37

#182 Safer presentations even in getDisplayMedia

Today’s choices, “Entire Screen”, “Window”, “Tab” are all unsafe!*

*Over-sharing (desktop, accidental back button or tab flip), even active malicious attacks on web’s same-origin policy

New (safe) choice: “Web Page”

  • TLBC w/site-isolated + opted-in document
  • Capture turns off on cross-origin navigation
  • Preferential placement in picker
  • 🍬 APIs for capturer ↔ capturee comms (e.g. next/prev slide, id) become available in safe garden

32 of 37

#182 Safer presentations even in getDisplayMedia

What does the spec need to facilitate this new choice in User Agents?

  • Editorial: Name this new source as a concept. “web page”/site/app?
  • Loosen Elevated Permission language in spec for these sources
  • Encourage preferential placement of them vs tabs, w/SHOULD
  • Specify requirements on sites to qualify as this new source:
    • window.crossOriginIsolated +
      • Document-Policy: html-capture
      • Require-Document-Policy: html-capture
  • Editorial: Name these site requirements as a concept (“opted into html capture”?), bc may be same requirements as getViewportMedia
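
As an illustration of what a site opt-in might look like on the wire, here is a sketch of the response headers and a check over them. Assumptions are explicit: the `html-capture` policy name is taken from the slide and is a proposal, the slide is read here as requiring all of the listed conditions together, the COOP/COEP pair is one known way to obtain `window.crossOriginIsolated`, and `optedIntoHtmlCapture` is a hypothetical helper name. Header parsing is deliberately simplified.

```javascript
// Headers a site might send to opt in (html-capture is a proposed policy name).
const optInHeaders = {
  "Cross-Origin-Opener-Policy": "same-origin",
  "Cross-Origin-Embedder-Policy": "require-corp",
  "Document-Policy": "html-capture",
  "Require-Document-Policy": "html-capture",
};

// Hypothetical helper: does a header set meet the requirements on the slide?
function optedIntoHtmlCapture(headers) {
  const get = (name) => (headers[name] || "").trim().toLowerCase();
  // Cross-origin isolation: COOP same-origin plus an embedder policy.
  const isolated =
    get("Cross-Origin-Opener-Policy") === "same-origin" &&
    ["require-corp", "credentialless"].includes(get("Cross-Origin-Embedder-Policy"));
  // Simplified parsing: treat the header as a comma-separated token list.
  const hasPolicy = (name) =>
    get(name).split(",").map((s) => s.trim()).includes("html-capture");
  return isolated && hasPolicy("Document-Policy") && hasPolicy("Require-Document-Policy");
}

const qualifies = optedIntoHtmlCapture(optInHeaders);
const plainSite = optedIntoHtmlCapture({ "Document-Policy": "html-capture" });
```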

33 of 37

#158 crop MediaStream from the share-this-tab API

Proposal from Youenn:

navigator.mediaDevices.getTabViewportMedia() // tab viewport

document.getViewportMedia() // captures document viewport

iframe.getViewportMedia() // captures iframe viewport

All 3 call the same underlying algorithm, taking a viewport as an internal parameter, which is responsible for permission policy, prompting, and creation of the track.

With this approach, there is no default option to select in the spec and no surprise for web developers about what they will get.

Easy feature detection of gradual support.

Follow-up Q: Permissions policy? Has site-isolation & capture opt-in been resolved?
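
Feature detection of the three proposed entry points could look like the sketch below. The method names come from the proposal above and are not assumed to exist in any browser; `detectViewportCapture` is a hypothetical helper, and mock objects stand in for `navigator`, `document` and an iframe so the logic can run anywhere.

```javascript
// Hypothetical feature detection for the three proposed entry points.
function detectViewportCapture(nav, doc, iframe) {
  return {
    tab: typeof nav?.mediaDevices?.getTabViewportMedia === "function",
    document: typeof doc?.getViewportMedia === "function",
    iframe: typeof iframe?.getViewportMedia === "function",
  };
}

// In a page this would be called as:
//   detectViewportCapture(navigator, document, document.createElement("iframe"))
// Here, mocks model a browser that only ships the document variant.
const support = detectViewportCapture(
  { mediaDevices: {} },
  { getViewportMedia: async () => ({}) },
  {}
);
```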

34 of 37

Discussion (11:15-11:30)

  • Questions?
  • Opinions?

35 of 37

Wrap-up and Next Steps (11:30-12:00)

36 of 37

For extra credit

Name the bird!

37 of 37

Thank you

Special thanks to:

WG Participants, Editors & Chairs

The mammal
