Joint MEDIA & WebRTC
WG Meeting
September 15, 2023
14:30 - 16:30 Seville time
5:30 - 7:30 AM Pacific Time
12:30 - 14:30 UTC
1
W3C WG IPR Policy
2
W3C Code of Conduct
3
Safety Reminders
While attending TPAC, follow the health rules:
Please be aware of and respect the personal boundaries of your fellow participants
4
Welcome!
5
About TPAC 2023 Meetings
6
Virtual Meeting Tips (Zoom)
7
Agenda
Time control:
8
Introduction
Start Time: 14:40
End Time: 14:55
9
Background
10
Examples of Similar Issues
11
Is there a better way forward?
12
WebCodecs-WebTransport Echo
13
Example
14
Configuration: {"alpha":"discard","bitrate":1000000,"bitrateMode":"variable","codec":"av01.0.08M.10.0.110.09","framerate":30,"hardwareAcceleration":"no-preference","height":1080,"latencyMode":"realtime","scalabilityMode":"L1T3","width":1920}
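For illustration, a minimal sketch (not the demo's actual code) of feeding this configuration to a WebCodecs VideoEncoder from a module or async function; the sendChunk output callback is a placeholder name:

const config = {
  alpha: 'discard', bitrate: 1_000_000, bitrateMode: 'variable',
  codec: 'av01.0.08M.10.0.110.09', framerate: 30,
  hardwareAcceleration: 'no-preference', height: 1080,
  latencyMode: 'realtime', scalabilityMode: 'L1T3', width: 1920,
};
const { supported } = await VideoEncoder.isConfigSupported(config);
if (supported) {
  const encoder = new VideoEncoder({ output: sendChunk, error: console.error });
  encoder.configure(config);
}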
Keyframe
Glass-Glass Latency
15
Encoding Latency
16
Decoding Latency
17
Discussion (End Time: 14:55)
18
QP-based Rate Control (Eugene)
Start Time: 14:55
End Time: 15:10
19
Frame QP-based Rate Control in WebCodecs
20
enum VideoEncoderBitrateMode {
  "constant",
  "variable",
  "quantizer"
};

constant
  Encode at a constant bitrate. See bitrate.
variable
  Encode using a variable bitrate, allowing more space to be used for complex signals and less space for less complex signals. See bitrate.
quantizer
  Encode using a quantizer that is specified for each video frame in codec-specific extensions of VideoEncoderEncodeOptions.

VideoEncoderEncodeOptionsForAv1

dictionary VideoEncoderEncodeOptionsForAv1 {
  unsigned short? quantizer;
};

quantizer, of type unsigned short, nullable
  Sets the per-frame quantizer value.
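A minimal sketch of driving per-frame QP through these hooks; handleChunk, frame, and qp are placeholder names for this example, and the AV1-specific options are passed under the av1 key as defined in the WebCodecs codec registry:

const encoder = new VideoEncoder({ output: handleChunk, error: console.error });
encoder.configure({
  codec: 'av01.0.08M.10.0.110.09',
  width: 1920,
  height: 1080,
  framerate: 30,
  latencyMode: 'realtime',
  bitrateMode: 'quantizer',   // the app supplies a QP for every frame
});
// Per frame: pick a QP (e.g. with the algorithm on the next slides) and encode.
encoder.encode(frame, { av1: { quantizer: qp } });
frame.close();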
QP-based Rate Control Demo
21
Primitive QP adjustment algorithm
const frames_to_consider = 4;
const frame_budget_bytes = (this.bitrate / this.fps) / 8;
const impact_ratio = [1.0 / 8, 1.0 / 8, 1.0 / 4, 1.0 / 2];
let chunks = this.chunks.slice(-frames_to_consider);
let normalized_chunk_size = 0;
for (let i = 0; i < frames_to_consider; i++)
  normalized_chunk_size += chunks[i].byteLength * impact_ratio[i];
const diff_bytes = normalized_chunk_size - frame_budget_bytes;
const diff_ratio = diff_bytes / frame_budget_bytes;

let qp_change = 0;
// Address overshoot more aggressively than undershoot.
// Don't change QP too much when it's already low, because that
// changes the chunk size too drastically.
if (diff_ratio > 0.6 && qp > 15) {
  qp_change = 3;
} else if (diff_ratio > 0.25 && qp > 5) {
  qp_change = 2;
} else if (diff_ratio > 0.04) {
  // Overshoot by more than 4%
  qp_change = 1;
} else if (diff_ratio < (qp < 10 ? -0.10 : -0.04)) {
  // Undershoot by more than 4% (or 10% if QP is already low)
  qp_change = -1;
}
const new_qp = this.clamp_qp(qp + qp_change);
22
For small QP values, always increase by only 1.
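The clamp_qp() helper is not shown on the slide; a plausible sketch, with bounds that are assumptions chosen to stay inside the AV1 quantizer range of 0-63:

// Assumed implementation, not taken from the demo.
function clamp_qp(qp) {
  const kMinQp = 1;   // assumed floor: very low QP makes chunk sizes swing wildly
  const kMaxQp = 63;  // AV1 quantizer upper bound
  return Math.min(kMaxQp, Math.max(kMinQp, Math.round(qp)));
}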
Discussion (End Time: 15:10)
23
Hardware Encode/Decode Error Handling
(Bernard & Fippo)
Start Time: 15:10
End Time: 15:25
24
For Discussion Today
25
Issue 146: Exposing decode errors / SW fallback as an event
26
Issue 146: Exposing decode errors / SW fallback as an event
27
Issue 146: Exposing decode errors (cont’d)
28
Issue 146: Exposing decode errors (cont’d)
29
Issue 146: Exposing decode errors (cont’d)
[Exposed=Window]
interface RTCRtpSenderErrorEvent : Event {
constructor(DOMString type, RTCRtpSenderErrorEventInit eventInitDict);
readonly attribute DOMString? rid;
readonly attribute unsigned short errorCode;
readonly attribute USVString errorText;
readonly attribute long long timestamp;
};
[Exposed=Window]
interface RTCRtpReceiverErrorEvent : Event {
constructor(DOMString type, RTCRtpReceiverErrorEventInit eventInitDict);
readonly attribute unsigned short errorCode;
readonly attribute USVString errorText;
readonly attribute long long timestamp;
};
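For illustration only: how an application might consume the proposed event, assuming the receiver-side variant fires on RTCRtpReceiver with type "error" (the event name and target are not settled in the issue):

receiver.addEventListener('error', (event) => {
  console.warn(`decode error ${event.errorCode} (${event.errorText}) at ${event.timestamp}`);
  // Possible reactions: renegotiate to a codec/profile the decoder handles,
  // accept software fallback, or ask the sender for a keyframe.
});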
30
Issue 669: Custom Error Types
31
Discussion (End Time: 15:25)
32
New Video Encoder API (Erik)
Start Time: 15:25
End Time: 16:25
33
Current State
34
[Architecture diagram - current state in Chrome: the PeerConnection path (WebRTC's VideoEncoder interface) and WebCodecs each reach encoders such as libvpx and vaapi through their own stacks of Adapters and Wrappers, with scalability logic duplicated in several places.]
Background
The Meet Video Quality team is planning a major overhaul of the internal WebRTC video encoder API. Our goals include:
35
Background
Let’s use this opportunity to align WebCodecs and WebRTC!
tl;dr:
36
Future Vision
37
[Architecture diagram - future vision: applications such as Google Meet (and a hypothetical "Hooli Greet") use WebCodecs directly, while inside Chrome both PeerConnection (WebRTC) and WebCodecs sit on the new video encoder API, sharing a Scalability Controller in front of the libvpx and vaapi Wrappers.]
Focus Areas
38
Codec selection & prioritization
39
Codec selection & prioritization
Solution: Let the app be in full control!
This implies that:
40
Flexible reference structures
Scalability modes - there are a lot of them!
Yet not enough…
41
Flexible reference structures
Many structures are cumbersome or impossible to implement with scalability modes:
It quickly becomes infeasible to implement this for every encoder.
42
Flexible reference structures
Solution: Let the app be in full control!
Model the encoder as a set of reference buffers.
Let the user specify encode settings per output frame:
43
Flexible reference structures
Let the app be in full control!
Benefits include (but are not limited to):
44
Rate Control
Many hardware-accelerated codecs have poor rate control - at least where RTC is concerned.
Solution: Let the app be in full control!
Use external rate control, adapting bitrate via per-frame QP (as in the QP-based rate control demo earlier).
Improvements and bug fixes can reside in the app - no need to wait for driver updates.
45
Draft API Proposal
46
For illustrative purposes (mostly)!
1: Encoder Capabilities
How we specify the characteristics of an encoder
47
Draft API - Capabilities
48
enum RateControlModes { kCQP, kCBR }
class BitrateControl {
frameDropping: boolean;
qpRange: [number, number];
supportedModes: set<RateControlModes>;
}
class EncoderCapabilities {
bitrateControl: BitrateControl;
predictionConstraints: PredictionConstraints;
inputConstraints: InputConstraints;
encodingFormats: EncodingFormat[];
performance: Performance;
}
Draft API - Capabilities
49
class PredictionConstraints {
totalBuffers: number;
maxReferences: number;
maxTemporalLayers = 1;
maxSpatialLayers = 1;
scalingFactors: set<number> = [1];
sharedBufferSpace = false;
}
class EncoderCapabilities {
bitrateControl: BitrateControl;
predictionConstraints: PredictionConstraints;
inputConstraints: InputConstraints;
encodingFormats: EncodingFormat[];
performance: Performance;
}
Draft API - Capabilities
50
enum PixelFormat { kI420, kNv12, kP016_LE, … }
class InputConstraints {
min: Resolution;
max: Resolution;
pixelAlignment: number;
inputFormats: PixelFormat[];
}
class EncoderCapabilities {
bitrateControl: BitrateControl;
predictionConstraints: PredictionConstraints;
inputConstraints: InputConstraints;
encodingFormats: EncodingFormat[];
performance: Performance;
}
Draft API - Capabilities
51
enum SubSampling { k420, k422, k444 }
class EncodingFormat {
subSampling: SubSampling;
bitdepth: number;
}
class EncoderCapabilities {
bitrateControl: BitrateControl;
predictionConstraints: PredictionConstraints;
inputConstraints: InputConstraints;
encodingFormats: EncodingFormat[];
performance: Performance;
}
Draft API - Capabilities
52
class Performance {
maxEncodedPixelsPerSeconds?: number;
minMaxEffortLevel: [number, number];
}
class EncoderCapabilities {
bitrateControl: BitrateControl;
predictionConstraints: PredictionConstraints;
inputConstraints: InputConstraints;
encodingFormats: EncodingFormat[];
performance: Performance;
}
2: Encoder Creation
53
Draft API - Create encoder
54
function enumerateVideoEncoders(): VideoEncoderFactory[]
interface VideoEncoderFactory {
getCapabilities(): EncoderCapabilities;
getImplementationName(): string;
getCodecName(): string;
getCodecSpecifics(): map<string, string>;
createVideoEncoder(settings: EncoderSettings): VideoEncoder;
}
Draft API - Create encoder
55
class EncoderSettings {
maxNumberOfThreads: number;
maxEncodeDimensions: Resolution;
encodingFormat: EncodingFormat;
rcMode: RateControlModes;
codecSpecifics: map<string, string>;
}
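For illustration (mostly): a JavaScript-flavored sketch of how an application might pick and create an encoder with this draft API. The names come from the slides; the 'AV1' codec name, the Set-style supportedModes lookup, and the settings values are assumptions:

const factory = enumerateVideoEncoders().find((f) =>
  f.getCodecName() === 'AV1' &&
  f.getCapabilities().bitrateControl.supportedModes.has('kCQP'));

const encoder = factory.createVideoEncoder({
  maxNumberOfThreads: 4,
  maxEncodeDimensions: { width: 1920, height: 1080 },
  encodingFormat: { subSampling: 'k420', bitdepth: 8 },
  rcMode: 'kCQP',
  codecSpecifics: new Map(),
});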
3: Encoding Frames
56
[Diagram: encode() takes an Input Frame, Temporal Unit Settings, and one Frame Encode Settings entry per requested output frame, and delivers each Encoded Frame through a Callback.]
Draft API - Encode frames
57
enum ContentHint { kMotion, kDetail }
enum FrameDroppingMode { kOff, kAnyLayer, kAllLayers }
class TemporalUnitSettings {
contentHint: ContentHint;
effortLevel: number;
frameDroppingMode: FrameDroppingMode;
}
interface VideoEncoder {
encode(
inputFrame: VideoFrame,
settings: TemporalUnitSettings,
frameEncodeSettings: LayerSettings[],
callback: EncodeResultCallback): bool;
}
Draft API - Encode frames
58
class CbrParams {
durationUs: number;
dataRateBps: number;
}
class CqpParams {
targetQp: number;
}
class LayerSettings {
rcOption: CbrParams|CqpParams;
...
}
interface VideoEncoder {
encode(
inputFrame: VideoFrame,
settings: TemporalUnitSettings,
frameEncodeSettings: LayerSettings[],
callback: EncodeResultCallback): bool;
}
Draft API - Encode frames
59
class LayerSettings {
rcOption: CbrParams|CqpParams;
forceKeyframe: boolean;
temporalId: number;
spatialId: number;
resolution: Resolution;
referenceBuffers: set<number>;
updateBuffers: set<number>;
}
interface VideoEncoder {
encode(
inputFrame: VideoFrame,
settings: TemporalUnitSettings,
frameEncodeSettings: LayerSettings[],
callback: EncodeResultCallback): bool;
}
Draft API - Encode frames
60
enum DropReason { kDropped, kError }
class DroppedFrame {
reason: DropReason;
spatialId: number;
}
class EncodedData {
bitstreamData: EncodedVideoChunk;
isKeyframe: boolean;
spatialId: number;
referencedBuffers: set<number>;
}
type EncodeResultCallback = (result: EncodedData|DroppedFrame) => void;
interface VideoEncoder {
encode(
inputFrame: VideoFrame,
settings: TemporalUnitSettings,
frameEncodeSettings: LayerSettings[],
callback: EncodeResultCallback): bool;
}
Example: Implementing L2T2_KEY
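// Key temporal unit (T0): S0 produces the keyframe; S1 predicts from the S0 buffer (0) and is stored in buffer 1.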
encoder.encode(
inputFrame, encoderSettings,
frameEncodeSettings: [{
forceKeyframe: true,
temporalId: 0,
spatialId: 0,
resolution: {320, 180},
referenceBuffers: [],
updateBuffers: [] // Implicitly all.
}, {
forceKeyframe: false,
temporalId: 0,
spatialId: 1,
resolution: {640, 360},
referenceBuffers: [0],
updateBuffers: [1]
}],
callback);
61
Example: Implementing L2T2_KEY
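// T1 temporal unit: each spatial layer predicts from its own buffer and updates nothing, so these frames are droppable.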
encoder.encode(
inputFrame, encoderSettings,
frameEncodeSettings: [{
forceKeyframe: false,
temporalId: 1,
spatialId: 0,
resolution: {320, 180},
referenceBuffers: [0],
updateBuffers: []
}, {
forceKeyframe: false,
temporalId: 1,
spatialId: 1,
resolution: {640, 360},
referenceBuffers: [1],
updateBuffers: []
}],
callback);
62
Example: Implementing L2T2_KEY
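// Later T0 temporal unit: each spatial layer predicts from and refreshes only its own buffer; after the key picture there is no inter-layer prediction (the "KEY" in L2T2_KEY).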
encoder.encode(
inputFrame, encoderSettings,
frameEncodeSettings: [{
forceKeyframe: false,
temporalId: 0,
spatialId: 0,
resolution: {320, 180},
referenceBuffers: [0],
updateBuffers: [0]
}, {
forceKeyframe: false,
temporalId: 0,
spatialId: 1,
resolution: {640, 360},
referenceBuffers: [1],
updateBuffers: [1]
}],
callback);
63
What do you mean, what “fingerprinting”?
64
Discussion (End Time: 16:25)
65
Wrapup and Next Steps
Start Time: 16:25
End Time: 16:30
66
Next Steps
67
Thank you
Special thanks to:
WG Participants, Editors & Chairs
68