Screen Capture Community Group
2023-06-26
Agenda
Introduction
Element Capture Follow-up
Jordan Bayles
Google
jophba@google.com
Mark Foltz
Google
mfoltz@google.com
What is Element Capture?
Recall region capture: it allows a video track captured from a tab to be cropped according to the bounds of some element on the page.
Proprietary + Confidential
Region Capture
Region Capture is a three step process:
1. Captured document creates a cropTarget for the content of interest.
const cropTarget = await CropTarget.fromElement(mainContentArea);
Region Capture
2. Application captures the tab with the embedded cropTarget.
const stream = await navigator.mediaDevices.getDisplayMedia({
  preferCurrentTab: true,
});
const [track] = stream.getVideoTracks();
Region Capture
3. Application crops the track with the cropTarget, which will capture only the main content area.
await track.cropTo(cropTarget);
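Put together, the three steps might look like the sketch below for the self-capture case. The function name `captureCroppedTab` and its injectable `deps` argument are assumptions made for this illustration, so that the flow can be exercised outside a browser; only `CropTarget.fromElement()`, `getDisplayMedia()` and `cropTo()` come from the slides.

```javascript
// Sketch only: captureCroppedTab and the `deps` parameter are hypothetical
// names introduced for this example, not part of Region Capture.
async function captureCroppedTab(mainContentArea, deps = {
  CropTarget: globalThis.CropTarget,
  mediaDevices: globalThis.navigator?.mediaDevices,
}) {
  const { CropTarget, mediaDevices } = deps;
  if (!CropTarget || !mediaDevices) {
    throw new Error('Region Capture is not available in this environment');
  }

  // 1. Create a cropTarget for the content of interest.
  const cropTarget = await CropTarget.fromElement(mainContentArea);

  // 2. Capture the tab that embeds the cropTarget.
  const stream = await mediaDevices.getDisplayMedia({ preferCurrentTab: true });
  const [track] = stream.getVideoTracks();

  // 3. Crop the track to the target; stop capturing if cropping is unsupported.
  if (typeof track.cropTo !== 'function') {
    track.stop();
    throw new Error('This track cannot be cropped');
  }
  await track.cropTo(cropTarget);
  return stream;
}
```

In a page, `captureCroppedTab(mainContentArea)` would be called from a user gesture, and the returned stream assigned to a video element or sent to a peer.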
What is Element Capture?
Element Capture
3. Application restricts the track to the cropTarget, which will capture only the main content area, without any occluding content.
await track.restrictTo(cropTarget);
Element Capture - What can be captured?
Element should form a "Stacking Context" - a new context to resolve z-index values
<div class="one">
  <div class="child" style="z-index:-1"></div>
</div>
[Figure: rendered boxes labeled "one" and "child"]
<div class="one" style="isolation:isolate">
  <div class="child" style="z-index:-1"></div>
</div>
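Following the second snippet above, an application could make sure an element forms a stacking context before using it as a capture target. This is a minimal sketch: the helper name `ensureStackingContext` is hypothetical, and it addresses only the stacking-context requirement, using the same `isolation: isolate` declaration shown in the slide.

```javascript
// Hypothetical helper: gives `element` its own stacking context via
// `isolation: isolate`, as in the second snippet above. It does not check
// any of the other eligibility conditions listed in this section.
function ensureStackingContext(element) {
  if (element.style.isolation !== 'isolate') {
    element.style.isolation = 'isolate';
  }
  return element;
}
```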
Element Capture - What can be captured?
Element should form a "Backdrop Root" - ancestor elements can't mask or filter it
Element Capture - What can be captured?
Element ancestor should not use 3D transforms
Element Capture - Open questions
Element Capture - Next Steps
Element Capture - Possible future work
Remote-Control API
Elad Alon
eladalon@google.com
Problem description
A user is in a video call and shares a tab.
How does the user…
If the user focused the captured tab…
Other solutions
Let’s briefly review the alternative solutions and see why further options, such as the one proposed in later slides, are worth exploring.
Proposed new solution
After a permissions prompt, allow the capturing application limited control over the captured surface.
Before we delve into the exact shape, let’s have a quick run through an example.
Sample usage 1:
Initiate capture and obtain permission
const controller = new CaptureController();
const stream = await navigator.mediaDevices.getDisplayMedia({ controller });
const video = document.getElementById('myVideoElement');
video.srcObject = stream;
// Perform a null-action so as to prompt the user for permission.
try {
  await controller.sendMouseWheel({});
} catch (e) {
  return;  // Permission denied. Bail.
}
Permission prompts are only displayed when attempting to use the API they gate access to.
A null event is intentionally used to avoid performing an action before the user requests one.
Sample usage 2:
Relay scroll events to captured surface
// Having obtained the user’s permission, we can now relay subsequent
// wheel events to the captured tab.
video.addEventListener("wheel", event => {
  const [x, y] = translateCoordinates(event.offsetX, event.offsetY);
  controller.sendMouseWheel({
    x,
    y,
    wheelDeltaX: event.wheelDeltaX,
    wheelDeltaY: event.wheelDeltaY,
  });
});
translateCoordinates() scales the coordinates in the video-element to those of the captured surface. Its implementation is left as an exercise for the reader.
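One possible shape for that exercise is sketched below. It rests on two assumptions that the slides do not state: the video element uses the default `object-fit: contain` letterboxing, and the video frame's pixel size matches the coordinate space of the captured surface. The name `scaleToSurface` and its extra `element` and `frame` parameters are hypothetical.

```javascript
// Hypothetical helper: maps a point inside the <video> element to pixel
// coordinates of the video frame. Assumes `object-fit: contain` and that the
// frame has non-zero dimensions (i.e. metadata has loaded).
function scaleToSurface(offsetX, offsetY, element, frame) {
  // Scale at which the frame is drawn inside the element.
  const scale = Math.min(element.width / frame.width,
                         element.height / frame.height);
  // Size of the letterbox bars on the left/right and top/bottom.
  const barX = (element.width - frame.width * scale) / 2;
  const barY = (element.height - frame.height * scale) / 2;
  const x = Math.round((offsetX - barX) / scale);
  const y = Math.round((offsetY - barY) / scale);
  // Clamp so that points over the bars map to the nearest edge of the frame.
  return [
    Math.min(Math.max(x, 0), frame.width - 1),
    Math.min(Math.max(y, 0), frame.height - 1),
  ];
}
```

Inside the wheel listener, the sizes could be read from the element itself, for example `scaleToSurface(event.offsetX, event.offsetY, { width: video.clientWidth, height: video.clientHeight }, { width: video.videoWidth, height: video.videoHeight })`.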
Sample usage 3:
Control zoom-level of captured tabs
const zoomInButton = document.getElementById('zoomInButton');
zoomInButton.addEventListener('click', async (event) => {
  const oldZoomLevel = await controller.getZoomLevel();
  const newZoomLevel =
      Math.min(oldZoomLevel + 10, controller.getMaxZoomLevel());
  // Await the promise so that a rejection surfaces in this handler.
  await controller.setZoomLevel(newZoomLevel);
});
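The same pattern extends to zooming out. The sketch below clamps in both directions, using only the zoom methods listed on the "Proposed API shape" slide; the helper name `stepZoom` is an assumption made for this example.

```javascript
// Hypothetical helper built on the methods proposed in these slides
// (getMinZoomLevel, getMaxZoomLevel, getZoomLevel, setZoomLevel).
async function stepZoom(controller, delta) {
  const current = await controller.getZoomLevel();
  // Keep the requested level within the range the controller reports.
  const target = Math.min(
    Math.max(current + delta, controller.getMinZoomLevel()),
    controller.getMaxZoomLevel());
  if (target !== current) {
    await controller.setZoomLevel(target);
  }
  return target;
}
```

A zoom-out button would then call `stepZoom(controller, -10)` in its click handler, and the zoom-in button `stepZoom(controller, 10)`.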
Possible capturing-app UX
Effect in captured tab
Proposed API shape
dictionary CapturedMouseWheelAction {
  int x = 0;
  int y = 0;
  int wheelDeltaX = 0;
  int wheelDeltaY = 0;
};
partial interface CaptureController {
  // 0. Pre-existing and irrelevant methods omitted.
  ...
  // 1. Scrolling
  Promise<undefined> sendMouseWheel(CapturedMouseWheelAction action);
  // 2. Zoom-level
  int getMinZoomLevel();
  int getMaxZoomLevel();
  Promise<int> getZoomLevel();
  Promise<undefined> setZoomLevel(int zoomLevel);
};
Finer points 1: Permission prompt
Finer points 2a: Scrolling and paging
Finer points 2b: MouseWheel vs. Paging
Excluding PiP from screen-capture
Arnaud Budkiewicz
Dialpad
Problem description
Picture-in-Picture is a great feature that keeps a window on top of the screen
Problem description
In all these cases, as soon as the user shares their screen, the PiP window is shared too, because it is part of what is on the screen.
Problem description
When the user is sharing a tab or a window, the PiP window is NOT shared, but sharing a screen exposes it IMMEDIATELY.
Problem description
The size of the PiP window is initially small, hiding a minimal fraction of what is underneath, but can be resized up to almost the size of the entire screen.
Problem description
Removing the PiP window from what getDisplayMedia is sharing with the far end could potentially expose unwanted content.
Proposal
One more thing…
While in a meeting, the state of the mic and camera is very important information that should be visible at all times. But when using PiP during a meeting, the icons disappear as soon as you move the cursor away from the PiP window, and the user no longer knows the state of the mic and camera.
As a user, I'd rather have the 3 icons visible at all times.
https://bugs.chromium.org/p/chromium/issues/detail?id=1442389
For developers, this could be a parameter that changes the default behavior:
CapturedMouseEvent listener addition after getDisplayMedia()
Frédéric Wang
Igalia
fwang@igalia.com
Quick recap
let controller = new CaptureController();
controller.oncapturedmousechange = (event) => {
  console.log(`surfaceX=${event.surfaceX}, surfaceY=${event.surfaceY}`);
};
let stream = await navigator.mediaDevices.getDisplayMedia({
  controller: controller
});
👉🏼 to be proposed at the WebRTC WG tomorrow
controller.addEventListener("capturedmousechange", (event) => { ... });
Quick recap
Problem description
let controller = new CaptureController();
controller.oncapturedmousechange = (event) => { ... };
let stream = await navigator.mediaDevices.getDisplayMedia({
  controller: controller
});
controller.oncapturedmousechange = (event) => { ... };
controller.addEventListener("capturedmousechange", ...);
Problem description
Alternatives
👉🏼 Implementations can postpone establishing the communication channel until the first event handler is registered.
👉🏼 Implementations can also stop sending events through the communication channel when all event handlers are removed.
👉🏼 Can do something similar to (1) when all event handlers are removed.
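The first two points can be illustrated with a toy model in plain JavaScript. It is only a sketch of the idea: the class `LazyMouseEventSource` and its methods are invented for this example and do not correspond to any browser internals or to the CaptureController interface.

```javascript
// Toy model (hypothetical names): the "channel" is opened when the first
// listener is registered and closed again when the last one is removed.
class LazyMouseEventSource {
  #listeners = new Set();
  #channelOpen = false;

  get channelOpen() {
    return this.#channelOpen;
  }

  addListener(listener) {
    this.#listeners.add(listener);
    // Postpone establishing the channel until someone is listening.
    this.#channelOpen = true;
  }

  removeListener(listener) {
    this.#listeners.delete(listener);
    // Stop sending events once nobody is listening anymore.
    if (this.#listeners.size === 0) {
      this.#channelOpen = false;
    }
  }

  // Delivers an event to all listeners; returns how many were notified.
  deliver(event) {
    if (!this.#channelOpen) return 0;
    for (const listener of this.#listeners) listener(event);
    return this.#listeners.size;
  }
}
```

With this model, events produced while no handler is registered are simply dropped, which mirrors the cost saving the slide describes.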
Dynamically switching between sources
Elad Alon
eladalon@google.com
Dynamic-switching
Chrome (any platform)
Safari (macOS)
Challenge
Can we keep extending UA capabilities without new spec changes?
Possibly, but some undesirable results would follow.
Examine the example of MediaStreamTrack.cropTo():
This is a general problem.
The user changing the target is an asynchronous event, outside the control of the app, and not easily observable by the app. And it can mean the difference between:
Alternative (with known issues)
Currently, dynamic-switching involves switching out the source.
But what if it didn’t? What if instead it:
Then:
const controller = new CaptureController();
controller.addEventListener('switch', (event) => {
  const videoElement = document.getElementById('myVideoElement');
  videoElement.srcObject = event.mediaStream;
});
…
navigator.mediaDevices.getDisplayMedia({ controller });
Proposal analysis
Cross-surface switching based on event-handler
Should we specify that the browser only allows cross-surface switching if the app registers an event handler to handle the new stream?
Administrative matters
Image by Sear Greyson.
Administrative matters
Until next time!
Image by Josh Nezon.