1 of 42

W3C WebRTC WG Meeting

March 15, 2022, 8 AM - 10 AM

Chairs: Bernard Aboba, Harald Alvestrand, Jan-Ivar Bruaroey

2 of 42

W3C WG IPR Policy

3 of 42

Welcome!

  • Welcome to the March 2022 interim meeting of the W3C WebRTC WG, at which we will cover:
    • WebRTC-SVC
    • WebRTC-Extensions
    • Avoiding the “Hall of Mirrors”
    • getViewportMedia
    • MediaCapture-Extensions
  • Future meetings:
    • April 19
    • May 17

4 of 42

About this Virtual Meeting

5 of 42

W3C Code of Conduct

  • This meeting operates under the W3C Code of Ethics and Professional Conduct
  • We're all passionate about improving WebRTC and the Web, but let's keep the conversations cordial and professional

6 of 42

Virtual Interim Meeting Tips

This session is (still) being recorded

  • Type +q and -q in the Google Meet chat to get into and out of the speaker queue.
  • Please use headphones when speaking to avoid echo.
  • Please wait for microphone access to be granted before speaking.
  • Please state your full name before speaking.
  • A poll mechanism may be used to gauge the “sense of the room”.

7 of 42

Understanding Document Status

  • Hosting within the W3C repo does not imply adoption by the WG.
    • WG adoption requires a Call for Adoption (CfA) on the mailing list.
  • Editor’s drafts do not represent WG consensus.
    • WG drafts do imply consensus, once they’re confirmed by a Call for Consensus (CfC) on the mailing list.
    • It is possible to merge PRs that may lack consensus, if a note is attached indicating controversy.

8 of 42

Poll about TPAC 2022

  • Are you considering attending TPAC 2022 in person (week of September 12, 2022, in Vancouver)?
    • Yes
    • No
    • Don’t know

9 of 42

Issues for Discussion Today

  • 08:10 - 08:30 AM (WebRTC-SVC and WebRTC-Extensions, Bernard)
    • Slides (08:10 - 08:20)
    • Discussion (08:20 - 08:30)
  • 08:30 - 09:00 AM (Avoiding the “Hall of Mirrors”, Elad)
    • Slides (08:30 - 08:45)
    • Discussion (08:45 - 09:00)
  • 09:00 - 09:20 AM (Display Surface Hints, getViewportMedia, Elad + Jan-Ivar)
  • 09:20 - 09:50 AM (MediaCapture-Extensions, Riju)
    • Slides (09:20 - 09:35)
    • Discussion (09:35 - 09:50)
  • 09:50 AM - 10:00 AM Wrap-up and Next Steps

Time control:

  • A warning will be given 2 minutes before time is up.
  • Once time has elapsed we will move on to the next item.

10 of 42

WebRTC-SVC &

WebRTC-Extensions (Bernard)

Start Time: 8:10 AM

End Time: 8:30 AM

11 of 42

Issues/PRs for Discussion

WebRTC-SVC

  • Issue 68/PR 69: Clarify behavior of getParameters()

WebRTC-Extensions

  • Issue 98: Disabling hardware acceleration
  • Issue 99: Should RTCRtpHeaderExtensionCapabilities offer an “enabled” member?

12 of 42

Issue 68: Clarify behavior of getParameters()

  • Section 4.2.3 is unclear about re-negotiation:

Before negotiation has completed, getParameters() returns the scalabilityMode value for each encoding in encodings, assuming it was successfully set by addTransceiver() or setParameters(). If no scalabilityMode value was provided for an encoding in encodings, or if a value was not successfully set, then getParameters() will not return a scalabilityMode value for that encoding.

After negotiation has completed, getParameters() returns the currently configured scalabilityMode value for each encoding in encodings. This may be different from the values requested in addTransceiver() or setParameters(). If the configuration is not satisfactory, setParameters() can be used to change it.

If an encoding in encodings had no scalabilityMode value provided to addTransceiver() or setParameters(), getParameters() returns the default scalabilityMode of the most preferred codec. The most preferred codec and the default scalabilityMode for each codec are both implementation dependent. The default scalabilityMode SHOULD be one of the temporal scalability modes (e.g. "L1T1", "L1T2", "L1T3", etc.).

13 of 42

PR 69: Clarify behavior of getParameters()

  • Proposed language:

Before the initial negotiation has completed, getParameters() returns the scalabilityMode value for each encoding in encodings, as last set by addTransceiver() or setParameters(). If no scalabilityMode value was provided for an encoding in encodings, or if a value was not successfully set, then getParameters() will not return a scalabilityMode value for that encoding.

After the initial negotiation has completed, getParameters() returns the currently configured scalabilityMode value for each encoding in encodings. This may be different from the values requested in addTransceiver() or setParameters(). If the configuration is not satisfactory, setParameters() can be used to change it.

If addTransceiver() or setParameters() did not provide a scalabilityMode value for an encoding in encodings, then after the initial negotiation has completed, getParameters() returns the default scalabilityMode of the most preferred codec for that encoding. The most preferred codec and the default scalabilityMode for each codec are both implementation dependent. The default scalabilityMode SHOULD be one of the temporal scalability modes (e.g. "L1T1", "L1T2", "L1T3", etc.).
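
A rough sketch of the clarified behavior (the comments paraphrase the PR; the mode values are illustrative):

const pc = new RTCPeerConnection();
const {sender} = pc.addTransceiver('video', {
  sendEncodings: [{scalabilityMode: 'L3T3'}]
});
// Before the initial negotiation: getParameters() echoes what was set.
console.log(sender.getParameters().encodings[0].scalabilityMode); // "L3T3"
// After the initial negotiation: getParameters() reports the currently
// configured mode, which may differ from the requested one; if none was
// requested, an implementation-dependent default (e.g. "L1T3") is returned.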

14 of 42

Issue 98: Disabling hardware acceleration

15 of 42

Issue 98: Potential Approaches

  • RTCRtpSender.setParameters()
    • Limitation: Cannot change the envelope negotiated by Offer/Answer
      • Can’t disable hardware acceleration for a hardware-only codec (or software for a software-only codec).
    • Is it necessary to be able to switch mid-stream?
  • RTCRtpTransceiver.setCodecPreferences()
    • Extend RTCRtpCodecCapability dictionary:

partial dictionary RTCRtpCodecCapability {
  HardwareAcceleration hardwareAcceleration = "no-preference";
};

    • Influences codec/profile combinations in createOffer/createAnswer.
      • A hardware-only codec/profile combination would not surface in createOffer/createAnswer if hardwareAcceleration was set to “prefer-software”.
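
A minimal sketch of how an application could use the extended dictionary (the hardwareAcceleration member is the proposal above, not a shipped API):

// Mark every preferred codec "prefer-software" so that hardware-only
// codec/profile combinations do not surface in createOffer()/createAnswer().
const pc = new RTCPeerConnection();
const transceiver = pc.addTransceiver('video');
const codecs = RTCRtpReceiver.getCapabilities('video').codecs.map(
    codec => ({...codec, hardwareAcceleration: 'prefer-software'}));
transceiver.setCodecPreferences(codecs);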

16 of 42

Issue 98: Potential Approaches (cont’d)

  • RTCRtpCodecCapability discovery in Media Capabilities API
      • Issue 185: Retrieving RTCRtpCodecCapability from MediaCapabilities when queried for webrtc
        • Should a hardwareAcceleration member be returned to indicate software-only or hardware-only codecs?
        • Or are the existing smooth / powerEfficient / supported members enough?
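
For reference, a sketch of what Media Capabilities already reports today for a WebRTC decode configuration (no hardwareAcceleration member involved):

const info = await navigator.mediaCapabilities.decodingInfo({
  type: 'webrtc',
  video: {
    contentType: 'video/VP8',
    width: 1280,
    height: 720,
    bitrate: 1000000,
    framerate: 30
  }
});
// Issue 185 asks whether these three members suffice, or whether a
// hardwareAcceleration member should be returned alongside them.
console.log(info.supported, info.smooth, info.powerEfficient);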

17 of 42

Issue 99: Should RTCRtpHeaderExtensionCapabilities offer an “enabled” member? (Harald)

  • Scenario: Implementation supports snazzy-extension
    • By default, it is not set in offers
    • It is listed in Capabilities
    • But doesn’t turn up on offers
    • You can’t see why
  • Is this a problem?
    • If yes - make info visible
    • If no (we can always inspect the offer) - no change
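
A sketch of today's “inspect the offer” workaround (snazzy-extension is the slide's placeholder name):

// The extension is listed in the static capabilities...
const {headerExtensions} = RTCRtpSender.getCapabilities('video');
console.log(headerExtensions.map(ext => ext.uri));
// ...but the only way to see whether it will actually be offered is to
// generate an offer and search the SDP.
const pc = new RTCPeerConnection();
pc.addTransceiver('video');
const offer = await pc.createOffer();
console.log(offer.sdp.includes('snazzy-extension')); // may still be false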

18 of 42

Discussion (End Time: 8:30 AM)

19 of 42

Avoiding the “Hall of Mirrors” (Elad)

Start Time: 08:30 AM

End Time: 09:00 AM

20 of 42

Hall of Mirrors - Reminder (Elad)

The “Hall of Mirrors” effect can be observed when an application captures a surface, then draws it back to the area being captured.

21 of 42

Hall of Mirrors - Problem (Elad)

Users can accidentally trigger one of three variants of the HoM effect:

  1. Capturing the current-tab.
  2. Capturing the current-window.
  3. Capturing the current-screen.

Many video-conferencing applications display a preview of the captured content back to the user, triggering the HoM.

Adverse effects include:

  1. Confuses local user.
  2. Confuses remote users.
  3. Could produce mic-howl (audio feedback).

22 of 42

Hall of Mirrors - Rejected “Solution” (Elad)

Rejected: “Maybe the user agent should just exclude the current tab from the list of available tabs?”

Note that this (rejected) solution exclusively focuses on tab-capture.

This would break legitimate applications which intentionally record the current tab, for example recording gameplay or video demonstrations. Such applications would normally not show the local user a preview, which means they don’t have a problem with HoM.

23 of 42

Hall of Mirrors - Suggested Solution (Elad)

Recall that getDisplayMedia() receives the following dictionary.

We could extend it:

dictionary DisplayMediaStreamConstraints {
  (boolean or MediaTrackConstraints) video = true;
  (boolean or MediaTrackConstraints) audio = false;
  boolean excludeCurrentTab;
};

The default behavior remains unchanged (in all browsers that currently support tab-capture), but applications can opt into the new behavior, which avoids HoM by removing the user's option to select the current tab.
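
A minimal usage sketch, assuming the excludeCurrentTab extension above:

// A conferencing app that renders a self-preview asks the user agent to
// omit the current tab from the picker. This is a hint; the user agent
// may ignore it.
const stream = await navigator.mediaDevices.getDisplayMedia({
  video: true,
  excludeCurrentTab: true
});
document.querySelector('video').srcObject = stream;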

24 of 42

Hall of Mirrors - Security Discussion (Elad)

“That influences user choice! It could be abused for social engineering!”

We have discussed (often) the dangers of allowing an application to push the user towards either:

  1. A display surface under the application’s control (e.g. current tab).
  2. A display surface that is inherently dangerous (e.g. a screen).

The suggested solution does neither. There is no degradation in security.

25 of 42

Hall of Mirrors - Default Value (Elad)

Three options for a default value:

  1. Avoid breaking current applications (excludeCurrentTab: false).
  2. Get us to where we want to end up. (Open for discussion where that is…)
  3. Don’t specify a default value.

I suggest we go with #3. This is a hint. It may be provided or omitted, and the user agent MAY regard or ignore it.

If a value for includeCurrentTab is specified, it is a hint. The user agent MAY regard this hint when deciding whether to include the current tab in the list of surfaces it offers to the user.

26 of 42

Hall of Mirrors - Potential Scope Expansion (Elad)

It is possible to go beyond tab-capture.

  • Three distinct knobs:
    • includeCurrentTab
    • includeCurrentWindow
    • includeCurrentMonitor
  • Cumulative knob:
    • includeCurrentTab
    • includeCurrentTabAndWindow
    • includeCurrentTabAndWindowAndMonitor

The security argument made in an earlier slide still holds, albeit less obviously.

27 of 42

Discussion (End Time: 09:00 AM)

28 of 42

Display Surface Hints (Elad)

getViewportMedia (Jan-Ivar)

Start Time: 09:00 AM

End Time: 09:20 AM

29 of 42

Display Surface Hinting - Let’s Resolve (Elad)

We have been discussing displaySurface hints for a long time. The latest manifestation of this discussion is issue #184, which has been under discussion for 9 months now.

This is a highly requested control knob. We should be able to accommodate Web developers in a timely manner with such small changes. We should converge on the least controversial proposal.

  • Keep using established mechanisms - constraints.
  • Leave the ultimate decision to the user agent. Keep it a hint.

User agents MAY change the order or prominence of offered choices in response to an application's preference, as indicated by the displaySurface constraint.
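
For example (a sketch; the constraint is a hint and the user still makes the final choice):

// Hint that the application would prefer a window capture; the user agent
// MAY reorder or emphasize the picker choices accordingly.
const stream = await navigator.mediaDevices.getDisplayMedia({
  video: {displaySurface: 'window'}
});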

Let’s ship it.

30 of 42

getViewportMedia() update: ready for Call for Adoption

Document up at https://w3c.github.io/mediacapture-viewport/ (UD)

Recap: (resolutions from April & September)

  • Captures top-level browsing context’s viewport (current tab) even from an iframe
  • Gated by
    • window.crossOriginIsolated
    • Document-Policy: viewport-capture (opt-in) + Require-Document-Policy: viewport-capture (#4)
    • User Permission viewport-capture
    • Permissions policy viewport-capture (in iframes)
    • transient activation
  • Same privacy indicator requirements and constraints (video+audio) as getDisplayMedia
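
A usage sketch, assuming the API shape in the Unofficial Draft:

// Only available in a cross-origin isolated document that opted in via
// "Document-Policy: viewport-capture" (plus permission and transient
// activation, per the recap above).
if (window.crossOriginIsolated) {
  const stream = await navigator.mediaDevices.getViewportMedia({video: true});
  document.querySelector('video').srcObject = stream;
}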

31 of 42

Discussion (End Time: 09:20 AM)

32 of 42

MediaCapture-Extensions

Start Time: 09:20 AM

End Time: 09:50 AM

33 of 42

PRs for Discussion

  • PR 48: Face Detection
  • PR 49: Background Concealment Blur
  • PR 57: Face detection, background blur and eye gaze correction example
  • PR 53: Lighting Correction
  • PR 55: Face Framing
  • PR 56: Eye gaze correction

34 of 42

PR 48: Face detection

partial interface VideoFrame {
  readonly attribute FrozenArray<DetectedFace>? detectedFaces;
};

dictionary DetectedFaceLandmark {
  required FrozenArray<Point2D> contour;
  FaceLandmark type;
};

enum FaceLandmark {
  "eye", "eyeLeft", "eyeRight", "mouth", "nose"
};

enum FaceDetectionMode {
  "none", "presence", "contour", "mesh"
};

35 of 42

Face detection example

// Check if face detection is supported by the browser.
const supports = navigator.mediaDevices.getSupportedConstraints();
if (supports.faceDetectionMode &&
    supports.faceDetectionNumContourPoints) {
  // The browser supports face contour detection.
} else {
  throw new Error('Face contour detection is not supported');
}

// Open the camera with face contour detection enabled.
const stream = await navigator.mediaDevices.getUserMedia({
  video: {
    faceDetectionMode: 'contour',
    faceDetectionNumContourPoints: {exact: 4}
  }
});
const [videoTrack] = stream.getVideoTracks();

// Process the track in a video worker and show the result to the user.
const videoElement = document.querySelector('video');
const videoWorker = new Worker('video-worker.js');
videoWorker.postMessage({track: videoTrack}, [videoTrack]);
const {data} = await new Promise(r => videoWorker.onmessage = r);
videoElement.srcObject = new MediaStream([data.videoTrack]);

// video-worker.js:
self.onmessage = async ({data: {track}}) => {
  const generator = new VideoTrackGenerator();
  self.postMessage({videoTrack: generator.track}, [generator.track]);
  const {readable} = new MediaStreamTrackProcessor({track});
  const transformer = new TransformStream({
    async transform(frame, controller) {
      // Log the four contour points of each detected face.
      for (const face of frame.detectedFaces) {
        console.log(
          `Face @ (${face.contour[0].x}, ${face.contour[0].y}), ` +
          `(${face.contour[1].x}, ${face.contour[1].y}), ` +
          `(${face.contour[2].x}, ${face.contour[2].y}), ` +
          `(${face.contour[3].x}, ${face.contour[3].y})`);
      }
      // Enqueue every frame, whether or not faces were detected.
      controller.enqueue(frame);
    }
  });
  await readable.pipeThrough(transformer).pipeTo(generator.writable);
};

36 of 42

PR 49: Background Concealment Blur

partial dictionary MediaTrackCapabilities {
  MediaSettingsRange backgroundBlur;
};

partial dictionary MediaTrackConstraintSet {
  MediaSettingsRange backgroundBlur;
};

const stream = await navigator.mediaDevices.getUserMedia({video: true});
const [videoTrack] = stream.getVideoTracks();

// Try to conceal the background.
const videoCapabilities = videoTrack.getCapabilities();
if (videoCapabilities.backgroundBlur) {
  await videoTrack.applyConstraints({
    advanced: [{backgroundBlur: videoCapabilities.backgroundBlur.max}]
  });
} else {
  // Background concealment is not supported by the platform or by the
  // camera. Consider falling back to some other method.
}

// Show to the user.
const videoElement = document.querySelector("video");
videoElement.srcObject = stream;

A value of 0.0 indicates no background blur, and increasing values indicate increasing background blur.

37 of 42

PR 57: Face detection, background blur, etc. example

// video-worker.js (continuing the pattern from the previous example):
self.onmessage = async ({data: {track}}) => {
  const generator = new VideoTrackGenerator();
  self.postMessage({videoTrack: generator.track}, [generator.track]);
  const {readable} = new MediaStreamTrackProcessor({track});
  const transformer = new TransformStream({
    async transform(frame, controller) {
      // Detect faces, or retrieve the faces already detected by the platform.
      const detectedFaces = customFaceDetection
          ? await detectFaces(frame)
          : frame.detectedFaces;
      // Blur the background if needed.
      if (customBackgroundBlur) {
        const newFrame = await blurBackground(frame, detectedFaces);
        frame.close();
        frame = newFrame;
      }
      // Correct the eye gaze if needed.
      if (customEyeGazeCorrection && (detectedFaces || []).length > 0) {
        const newFrame = await correctEyeGaze(frame, detectedFaces);
        frame.close();
        frame = newFrame;
      }
      controller.enqueue(frame);
    }
  });
  await readable.pipeThrough(transformer).pipeTo(generator.writable);
};

38 of 42

PR 53: Lighting Correction

Lighting correction is a boolean setting controlling whether face and background lighting balance is to be corrected.

39 of 42

PR 55: Face Framing

Face framing is a boolean setting controlling whether framing is to be improved by cropping to faces.

40 of 42

PR 56: Eye gaze correction

Eye gaze correction is a boolean setting controlling whether the eye gaze is to be corrected.
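
A combined sketch for the boolean settings in PRs 53, 55 and 56. The member names below (lightingCorrection, faceFraming, eyeGazeCorrection) are assumptions for illustration; the exact names are defined in the respective PRs:

// Enable whichever of the (hypothetically named) boolean settings the
// camera and platform support.
const stream = await navigator.mediaDevices.getUserMedia({video: true});
const [videoTrack] = stream.getVideoTracks();
const capabilities = videoTrack.getCapabilities();
const constraints = {};
if (capabilities.lightingCorrection) constraints.lightingCorrection = true;
if (capabilities.faceFraming) constraints.faceFraming = true;
if (capabilities.eyeGazeCorrection) constraints.eyeGazeCorrection = true;
await videoTrack.applyConstraints(constraints);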

41 of 42

Discussion (End Time: 09:50 AM)

42 of 42

Thank you

Special thanks to:

WG Participants, Editors & Chairs
