1 of 26

Architecting your WebRTC application for scalability

Alberto Gonzalez and Arin Sime

TADSummit

Nov 8-9, 2022

2 of 26

How to Architect your WebRTC application for scalability!

@WebRTCventures

Alberto Gonzalez, CTO

Arin Sime, CEO

Agenda

  • Why it’s not easy to build scalable apps with WebRTC
  • Open Source vs CPaaS
  • So I just need an SFU or MCU to scale my app?
  • RTC orchestration and containers
  • Stickiness, persistence and load testing
  • Optimizations: App optimizations, media optimizations (e.g., codecs)

3 of 26

WebRTC is not quite this simple…

  • STUN/TURN servers
  • Application Signaling
  • Video codecs
  • Browser/Mobile Support
  • Recording
  • Group chat/scaling
  • Broadcasting

4 of 26

4 Ways to build your app…

  1. To the standard, i.e., “build your own stack”
  2. Unbundled WebRTC
  3. Open source media servers
  4. CPaaS – Communications Platform as a Service

5 of 26

#1 – Building to the WebRTC Standard

You must build and handle all of the following (with great power comes great responsibility!):

  • Compiling webrtc lib
  • STUN/TURN servers
  • Application Signaling
  • Video/audio codecs
  • Group chat/scaling
  • Browser/Mobile Support
  • Recording/Other Add-on features

In return, you can better utilize capabilities like WebCodecs and WebTransport, and control low-level details for specific use cases.

https://webrtc.org

6 of 26

#2 – Unbundling WebRTC

May be appropriate when you find yourself saying “I wish WebRTC would do this instead…”

Diagram:

  • Typical WebRTC: Capture, Encode, Send, media server, Receive, Decode, Play
  • Unbundled WebRTC (with WebAssembly): Capture, WebCodecs, WebTransport, media server, WebTransport, WebCodecs, Play

Diagram adapted from a presentation by Tsahi Levent-Levi on WebRTC Live, bloggeek.me

Unbundled WebRTC allows the use of other standards for more control over codecs and transport, and makes it possible to add insertable streams.

7 of 26

#3 – Open Source Media Servers

Media Servers will handle:

  • video/audio details
  • part or all of the signaling
  • Possibly STUN, TURN
  • Scaling capabilities
  • Could be SFUs or MCUs
  • Browser/Mobile support

But you host/manage:

  • All infrastructure and updates

Media Servers

Your cloud servers

8 of 26

#3 – Open Source Media Servers

Popular examples:

  • janus.conf.meetecho.com
  • jitsi.org
  • Pion.ly
  • mediasoup.org
  • LiveKit.io

9 of 26

#4 – CPaaS – Communications Platforms

Your Application servers

A CPaaS will handle:

  • All WebRTC support / updates
  • Media Servers
  • STUN/TURN
  • Web/Mobile Support
  • Additional features like Recording, SMS, Voice/VOIP, Transcription, etc

But you pay according to usage

CPaaS

10 of 26

#4 – CPaaS – Communications Platforms

Popular examples:

11 of 26

It’s all about tradeoffs…

WebRTC architecture  | WebRTC Standard | Unbundled WebRTC | Open Source Media Servers | CPaaS
Up front cost        | High            | High             | Medium                    | Low
Ongoing cost         | Low             | Low              | Low                       | High
Technical difficulty | High            | Medium-High      | Medium                    | Low
Features included    | Low             | High*            | Medium                    | High

*Not really included, but you have flexibility to build your own on top of the underlying APIs

And what about intellectual property?

Also, what works for you today does not have to be your long term choice.

12 of 26

WebRTC Scaling Challenges

13 of 26

SFUs or MCUs can help scale WebRTC

MCU – Multipoint Control Unit

  • Handles mixing of video/audio streams in a central server so each participant only has one stream to deal with

SFU – Selective Forwarding Unit

  • Each participant connects only to the SFU, but receives a separate stream for each other participant

Either can add features beyond scaling

  • Recording
  • Broadcasting
  • Interface to other services like transcription or VoIP legacy systems

14 of 26

MCU example

  • Multipoint Control Unit
  • Central server mixes all audio and video
  • Each participant gets only one downloaded stream each for audio and video
  • MCU controls a composited layout of that video for everyone, which can be nice but also introduces latency
  • Heavy processing is required on an MCU, but it offers more predictable bandwidth requirements

Media Servers offering MCU capability (not a comprehensive list):

MCU

15 of 26

SFU example

  • Selective Forwarding Unit
  • Routes the correct stream to each user
  • Still unique streams for each participant (allows for layout changes on user side)
  • More powerful and more modern option but more complicated implementation
  • Lower server CPU required but more variable bandwidth (based on the number of users)
  • Possible to do end-to-end encryption
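
The bandwidth difference between the two models can be made concrete by counting streams. The following is a minimal sketch, assuming every participant publishes exactly one stream (audio and video counted together) and subscribes to every other participant, with no simulcast. The function name and the dictionary keys are hypothetical, chosen only for this illustration.

```python
def stream_counts(participants: int) -> dict[str, dict[str, int]]:
    """Count media streams for an n-party call routed through an SFU or an MCU.

    Assumption for this sketch: one published stream per participant,
    and everyone subscribes to everyone else.
    """
    if participants < 2:
        raise ValueError("a call needs at least two participants")
    others = participants - 1
    return {
        # SFU forwards each publisher's stream to every other participant.
        "sfu": {
            "down_per_participant": others,
            "server_out_total": participants * others,
        },
        # MCU mixes everything and sends one composited stream to each participant.
        "mcu": {
            "down_per_participant": 1,
            "server_out_total": participants,
        },
    }


for n in (2, 4, 10):
    print(n, stream_counts(n))
```

With an SFU, the number of streams leaving the server grows with the square of the room size, while with an MCU it grows linearly. That is the "more variable bandwidth" versus "more predictable bandwidth" tradeoff in the two lists.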

Media Servers offering SFU capability (not a comprehensive list):

SFU

16 of 26

Scaling beyond single media server applications

Depends on the use case… What happens if we have 1000+ viewers?

For large broadcasting applications:

Diagram: multiple SFU instances
17 of 26

Scaling beyond single media server applications

For large multiparty video conferencing applications:

Diagram: two SFU instances
18 of 26

Scaling beyond single media server applications

Video group calls with telephony integration

Diagram labels: Web Publishers (User 1, User 2, User 3), SFU, MCU, IP-PBX, Phone caller. Link labels: WebRTC, RTP, SIP/RTP.

19 of 26

Large Video Conferencing Architecture considerations

  • Multiparty video conferencing support?

  • Integration of multiple channels

  • Integration with VoIP legacy systems

  • Recording/voicemail and speech to text

20 of 26

Orchestration and Containers in WebRTC applications to Achieve Horizontal Scalability

Challenges

  • Decouple media server from application logic

  • Stateful system complexities

  • Autoscaling / Downscaling

  • Overprovisioning

21 of 26

WebRTC Scalability Autoscaling Rules

Planning your autoscaling rules

  • Connections threshold for autoscaling
    • More accurate than CPU/bandwidth

  • Maximum number of sessions/rooms per server

  • Maximum users per room
    • To make sure we can predict the load each room adds to a server

  • Desired resources buffer for quick spikes:
    • 1, 2 or even 10 servers ready?
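
The rules above can be sketched as a small capacity calculation. This is an illustration under stated assumptions, not a recommended configuration: the class, the function names and every number (200 connections per server, a 70% scale-out threshold, 20 users per room, 2 buffer servers) are hypothetical values picked for the example.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ScalingRules:
    # All values are assumptions for this sketch; tune them from load tests.
    max_connections_per_server: int = 200
    scale_out_percent: int = 70      # add capacity before servers are full
    max_users_per_room: int = 20     # caps the load one room can add
    buffer_servers: int = 2          # idle servers kept ready for spikes


def rooms_per_server(rules: ScalingRules) -> int:
    """Worst case: every room is full, so this many rooms fit on one server."""
    return rules.max_connections_per_server // rules.max_users_per_room


def desired_servers(total_connections: int, rules: ScalingRules) -> int:
    """Servers to run for the current connection count, plus the spike buffer."""
    if total_connections < 0:
        raise ValueError("total_connections must not be negative")
    usable = rules.max_connections_per_server * rules.scale_out_percent // 100
    needed = math.ceil(total_connections / usable)
    return needed + rules.buffer_servers


rules = ScalingRules()
print(rooms_per_server(rules), desired_servers(1000, rules))
```

Scaling on connection counts rather than CPU keeps the decision tied to a number the application already tracks. The buffer term is what absorbs quick spikes while new servers start.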

Example of users joining media servers at a different pace

22 of 26

WebRTC scalability, stickiness and persistence

Sticky Sessions

  • We need all users in a call to use the same media server
  • Generally needs additional app logic built to distribute traffic accordingly
  • Approaches:
    • Cookie-based load balancer sticky sessions
    • Direct routing through initial auth
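
One way to picture "direct routing through initial auth" is a signed token: the auth step decides which media server hosts the room and hands the client a token naming that server. The sketch below is only an illustration of that idea. The token layout, the secret, the server name and both function names are hypothetical, and it assumes room IDs and server names never contain the "|" character.

```python
import hashlib
import hmac

SECRET = b"demo-only-secret"  # hypothetical; load from secure config in practice


def _sign(payload: str) -> str:
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def issue_routing_token(room_id: str, server: str) -> str:
    """Called by the auth service once it has chosen a media server for the room."""
    payload = f"{room_id}|{server}"
    return f"{payload}|{_sign(payload)}"


def resolve_server(token: str, room_id: str) -> str:
    """Called at connection time: return the server only if the token is valid."""
    parts = token.split("|")
    if len(parts) != 3:
        raise PermissionError("malformed routing token")
    token_room, server, signature = parts
    expected = _sign(f"{token_room}|{server}")
    if not hmac.compare_digest(signature, expected) or token_room != room_id:
        raise PermissionError("invalid routing token")
    return server


token = issue_routing_token("room-1", "sfu-a.example.internal")
print(resolve_server(token, "room-1"))
```

Because every user in the room receives a token naming the same server, all of them land on the same media server without the load balancer needing to understand rooms.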

Data Persistence

  • All servers need to be aware of the current situation of the connections
  • DB- or cache-based storage systems can be used to store session information and distribute traffic
  • PubSub mechanisms can be a good addition to decouple and scale independently
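
The persistence and PubSub points can be sketched together. In the example below, an in-memory class stands in for the shared DB or cache, and a list of callbacks stands in for a PubSub channel. Both are assumptions made to keep the sketch self-contained, as are the class name, the server names and the "least loaded server" placement policy.

```python
from typing import Callable


class SessionRegistry:
    """In-memory stand-in for a shared DB/cache that all app servers would read."""

    def __init__(self, servers: list[str]) -> None:
        self.room_to_server: dict[str, str] = {}
        self.load: dict[str, int] = {server: 0 for server in servers}
        self.subscribers: list[Callable[[dict], None]] = []

    def subscribe(self, callback: Callable[[dict], None]) -> None:
        """Register a listener, standing in for a PubSub subscription."""
        self.subscribers.append(callback)

    def join(self, room_id: str, user_id: str) -> str:
        """Return the media server for this room, assigning one on first join."""
        server = self.room_to_server.get(room_id)
        if server is None:
            # New room: place it on the server with the fewest connections.
            server = min(self.load, key=lambda name: self.load[name])
            self.room_to_server[room_id] = server
        self.load[server] += 1
        event = {"type": "join", "room": room_id, "user": user_id, "server": server}
        for callback in self.subscribers:
            callback(event)
        return server


registry = SessionRegistry(["sfu-a", "sfu-b"])
registry.subscribe(print)
registry.join("room-1", "alice")
registry.join("room-1", "bob")
registry.join("room-2", "carol")
```

Keeping the room-to-server mapping outside any single application server is what lets those servers scale independently: any of them can answer "where does this room live?" by reading the shared store.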

Basic WebRTC Scalability and High Availability Architecture

23 of 26

WebRTC load testing: testing your scalable application

Approaches

  • Build your own

  • Open Source

  • Third party platforms

What do we want to validate?

  • Connections and media received/sent

  • Jitter/Round Trip Time (RTT)/Packet Loss

  • Acceptable Mean Opinion Score (MOS)
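
A load test can turn the measured RTT, jitter and packet loss into an estimated MOS and check it against a target. The sketch below uses a commonly quoted simplified approximation loosely based on the E-model R-factor. It is not a full ITU-T G.107 implementation, and treating one-way delay as RTT/2 is an assumption of this example. The function name and the 3.5 threshold are also hypothetical.

```python
ACCEPTABLE_MOS = 3.5  # assumed target for this sketch; choose your own


def estimate_mos(rtt_ms: float, jitter_ms: float, packet_loss_pct: float) -> float:
    """Rough MOS estimate (1.0 to 4.5) from network stats gathered in a load test."""
    # Assumption: one-way delay is half the RTT; jitter is weighted double,
    # plus a fixed 10 ms allowance for processing delay.
    effective_latency = rtt_ms / 2 + 2 * jitter_ms + 10
    if effective_latency < 160:
        r = 93.2 - effective_latency / 40
    else:
        r = 93.2 - (effective_latency - 120) / 10
    r -= 2.5 * packet_loss_pct          # each percent of loss costs 2.5 R points
    r = max(0.0, min(100.0, r))         # keep R inside its valid range
    mos = 1 + 0.035 * r + 7e-6 * r * (r - 60) * (100 - r)
    return round(mos, 2)


for stats in ((40, 5, 0), (300, 40, 10)):
    mos = estimate_mos(*stats)
    print(stats, mos, "ok" if mos >= ACCEPTABLE_MOS else "below target")
```

In a real test run, the inputs would come from the stats each simulated client collects, and the threshold check would decide whether the run passes.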

24 of 26

Application and Media Optimizations today

What can you do?

  • Simulcast or SVC

  • Audio detection

  • Adaptive bitrate based on resolution

  • Opus RED and DTX (Discontinuous Transmission)
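
With simulcast, the publisher sends several encodings of the same video and the SFU forwards the one each receiver can afford. A minimal sketch of that selection step follows. The layer names, resolutions and bitrates are assumptions picked for illustration, and `select_layer` is a hypothetical helper rather than the API of any particular media server.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Layer:
    rid: str            # encoding identifier
    height: int         # vertical resolution in pixels
    bitrate_kbps: int   # approximate bitrate of this encoding


# Assumed simulcast ladder for this sketch.
LAYERS = [
    Layer("q", 180, 150),
    Layer("h", 360, 500),
    Layer("f", 720, 1500),
]


def select_layer(available_kbps: int, layers: list[Layer] = LAYERS) -> Optional[Layer]:
    """Pick the highest-bitrate layer that fits the receiver's estimated bandwidth.

    Returns None when even the lowest layer does not fit (e.g. fall back to audio only).
    """
    affordable = [layer for layer in layers if layer.bitrate_kbps <= available_kbps]
    if not affordable:
        return None
    return max(affordable, key=lambda layer: layer.bitrate_kbps)


for kbps in (100, 600, 5000):
    print(kbps, select_layer(kbps))
```

The same selection logic applies to SVC, except that the layers are carried inside one encoded stream and the SFU drops the layers a receiver cannot use.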

Diagram: SVC example (WebRTC SFU)

Diagram: Receiving Opus RED (Redundant Audio Data) example. Packets Audio #1 to Audio #4 carry redundant copies of earlier audio (A#1, A#2, A#3), which cover a missing packet.
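
The idea behind Opus RED can be modeled in a few lines: each packet carries its own audio frame plus a copy of the previous frame, so a single lost packet can be rebuilt from the one that follows it. This is a toy model of the concept only. It does not implement the real RTP payload format, and the packet structure and function names are hypothetical.

```python
from typing import Optional


def packetize(frames: list[str], redundancy: int = 1) -> list[dict]:
    """Build packets that carry the primary frame plus copies of earlier frames."""
    packets = []
    for seq, frame in enumerate(frames):
        start = max(0, seq - redundancy)
        redundant = {i: frames[i] for i in range(start, seq)}
        packets.append({"seq": seq, "primary": frame, "redundant": redundant})
    return packets


def reassemble(received: list[dict], total: int) -> list[Optional[str]]:
    """Rebuild the frame sequence, filling gaps from redundant copies when possible."""
    recovered: dict[int, str] = {}
    for packet in received:
        recovered[packet["seq"]] = packet["primary"]
        for seq, frame in packet["redundant"].items():
            recovered.setdefault(seq, frame)
    return [recovered.get(i) for i in range(total)]


frames = ["A#1", "A#2", "A#3", "A#4"]
packets = packetize(frames)
received = packets[:2] + packets[3:]     # the packet carrying A#3 is lost
print(reassemble(received, len(frames)))
```

The recovery only works while a later packet arrives, which is why the last frame in a burst cannot be repaired this way. The cost is the extra bandwidth spent on the redundant copies.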

25 of 26

Application and Media Optimizations tomorrow

What will be recommended soon?

  • AV1 video codec*

  • Lyra V2 audio codec*

  • Other ML optimizations (e.g., noise reduction or packet loss concealment)

*It is possible to use these today, but encoding performance is not great because average hardware is not ready for them, and some browsers and devices don’t support them yet.

Lyra v2 Google open source results: https://opensource.googleblog.com/2022/09/lyra-v2-a-better-faster-and-more-versatile-speech-codec.html

26 of 26

Thank you!

Learn more about us:

https://webrtc.ventures

Follow us on Twitter:

@WebRTCventures

Experts in live video app development for:

Telehealth, Broadcasting, Contact Centers, and More!

@lbertogon

@arinsime

Contact us at team@webrtc.ventures