CS 168, Summer 2025 @ UC Berkeley
Slides credit: Sylvia Ratnasamy, Rob Shakir, Peyrin Kao, Ankit Singla, Murphy McCauley
Datacenter Routing
Lecture 19 (Datacenters 2)
Datacenter Routing
Lecture 19, CS 168, Summer 2025
Datacenter Routing
Datacenter Addressing
Virtualization and Encapsulation
Why are Datacenters Different? – Multiple Paths
Recall: In a Clos network, there are many paths between two servers.
Our routing algorithms so far pick a single path from source to destination.
How do we modify our routing protocols to find multiple paths?
Why are Datacenters Different? – Multiple Paths
We want routing protocols to find multiple paths between two hosts.
[Diagram: left, a simple network with a single path between A and B through routers R1–R3; right, a Clos network where hosts A–D connect through routers R1–R4. Bandwidth is 1 on all links; the slide compares bandwidth and coordination across the two topologies.]
Why are Datacenters Different? – Multiple Paths
Equal Cost Multi-Path (ECMP) finds all of the shortest paths (with equal cost).
Then, we load-balance packets across those paths.
[Diagram: the same two topologies as before (hosts A–D, routers R1–R4, bandwidth 1 on all links), now with traffic load-balanced across the equal-cost paths.]
ECMP Load-Balancing
If there are multiple shortest paths, how does the router load-balance packets between those paths?
[Diagram: A sends to B through R1. The top path is not the shortest, so packets won't be sent that way; R1 has to load-balance packets across the other 2 links. A function f maps the packet's Layer 3 and Layer 4 headers to an output link: f(headers) = Link ___.]
ECMP Load-Balancing Strategy #1 – Round-Robin
Round-robin: Ignore packet contents, and alternate sending between links.
Problem: TCP packet reordering.
ECMP Load-Balancing Strategy #2 – Destination-Based
Use destination IP to choose link.
Problem: If lots of sources sending to the same destination, one link is overloaded.
[Diagram: f(destination IP) = Link ___]
ECMP Load-Balancing Strategy #3 – Source-Based
Use source IP to choose link.
Problem: If the same source is sending to lots of destinations, one link is overloaded.
[Diagram: f(source IP) = Link ___]
ECMP Load-Balancing Strategy #4 – IP-Based
Use source and destination IP to choose link.
Using both values helps spread out packets across links.
Problem: What if there are multiple large flows between the same two servers?
[Diagram: f(source IP, destination IP) = Link ___]
ECMP Load-Balancing Strategy #5 – Flow-Based
Use 5 header values to choose the link: source IP, destination IP, source port, destination port, and protocol (TCP/UDP).
This is called per-flow load-balancing.
[Diagram: f(source IP, destination IP, source port, destination port, protocol) = Link ___]
Note: This does not account for flows being different sizes. Tracking flow size is more complex, for not a lot of benefit.
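The per-flow strategy can be sketched as a hash over the five header fields. This is a minimal illustration, not a specific router's implementation; the function name and hash choice are assumptions.

```python
import hashlib

# Per-flow ECMP sketch: hash the 5-tuple (source IP, destination IP,
# source port, destination port, protocol) to pick an output link.
# Every packet in a flow shares the same 5-tuple, so the whole flow
# takes the same link -- no reordering within a flow.

def pick_link(src_ip, dst_ip, src_port, dst_port, protocol, num_links):
    # Combine the five header fields into one byte string.
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{protocol}".encode()
    # A stable hash spreads different flows evenly across the links.
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:8], "big") % num_links

# Packets from the same flow always map to the same link:
a = pick_link("10.1.1.1", "10.4.2.2", 51000, 443, "TCP", 2)
b = pick_link("10.1.1.1", "10.4.2.2", 51000, 443, "TCP", 2)
assert a == b
```

Real routers typically use cheaper hardware hash functions, but the property that matters is the same: deterministic per flow, spread across flows.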
Multi-Path Distance-Vector Protocols
How do we adjust distance-vector protocols to support multiple paths?
[Diagram: host A, routers R1–R4, and host B.]
I'm R2. I can reach B with cost 2.
R1's forwarding table
| To: | Via: | Cost: |
| B | R2 | 3 |
Multi-Path Distance-Vector Protocols
Normal distance-vector:
I'm R4. I can reach B with cost 2.
R1's forwarding table
| To: | Via: | Cost: |
| B | R2 | 3 |
I already have a cost-3 path to B.
Your path is not better, so I'll ignore it.
Multi-Path Distance-Vector Protocols
Multi-path distance-vector:
R1's forwarding table
| To: | Via: | Cost: |
| B | R2 | 3 |
| B | R4 | 3 |
Your path is equally good. I'll remember both paths.
I'm R4. I can reach B with cost 2.
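The multi-path update rule above can be sketched as follows. This is a toy model, not a real protocol implementation; the table layout is an assumption.

```python
# Multi-path distance-vector sketch: instead of keeping only the single
# best next hop, keep every neighbor whose advertised path ties the best
# cost. Worse advertisements are ignored, as in normal distance-vector.

def update(table, dest, via, advertised_cost, link_cost):
    """table maps dest -> (best_cost, set of equal-cost next hops)."""
    cost = advertised_cost + link_cost
    best_cost, next_hops = table.get(dest, (float("inf"), set()))
    if cost < best_cost:
        table[dest] = (cost, {via})              # strictly better: replace
    elif cost == best_cost:
        next_hops.add(via)                       # equally good: keep both
        table[dest] = (best_cost, next_hops)

# R1 hears "cost 2 to B" from both R2 and R4, each over a cost-1 link:
table = {}
update(table, "B", "R2", 2, 1)
update(table, "B", "R4", 2, 1)
print(table["B"][0], sorted(table["B"][1]))  # 3 ['R2', 'R4']
```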
Multi-Path Link-State Protocols
Recall link-state: Each router stores the full network graph.
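Because each router already stores the full graph, it can compute every equal-cost shortest path locally, e.g. with a Dijkstra variant that remembers all tied predecessors. A sketch, with an illustrative graph and function name:

```python
import heapq

# Multi-path link-state sketch: run Dijkstra over the full graph, but keep
# *every* predecessor that lies on some shortest path. Walking predecessors
# back from the destination yields all ECMP next hops out of the source.

def ecmp_next_hops(graph, src, dst):
    dist = {src: 0}
    preds = {src: set()}
    pq = [(0, src)]
    while pq:
        d, u = heapq.heappop(pq)
        if d > dist.get(u, float("inf")):
            continue
        for v, w in graph[u]:
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], preds[v] = nd, {u}      # strictly shorter path
                heapq.heappush(pq, (nd, v))
            elif nd == dist.get(v, float("inf")):
                preds[v].add(u)                  # another equal-cost path
    # Walk predecessors back from dst; neighbors of src on a shortest
    # path are the usable first hops.
    hops, stack, seen = set(), [dst], set()
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        for p in preds.get(node, ()):
            if p == src:
                hops.add(node)
            else:
                stack.append(p)
    return hops

graph = {"R1": [("R2", 1), ("R4", 1)], "R2": [("R3", 1)],
         "R4": [("R3", 1)], "R3": [("B", 1)], "B": []}
print(sorted(ecmp_next_hops(graph, "R1", "B")))  # ['R2', 'R4']
```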
Datacenter Addressing
Why are Datacenters Different? – Scaling Routing
How do we scale routing protocols in datacenters?
Recall: Clos networks scale by using commodity switches.
Topology-Aware Addressing
We can scale routing using hierarchical addressing.
Topology-Aware Addressing
[Diagram: each region of the Clos network is assigned a /16 (10.1.0.0/16 through 10.4.0.0/16); within each /16, each rack gets a /24 (e.g. 10.3.1.0/24, 10.3.2.0/24); each server gets an address inside its rack's /24 (e.g. 10.3.1.1, 10.3.1.2). Router R has one link toward each /16.]

R's forwarding table
| To: | Use: |
| 10.1.0.0/16 | Link 1 |
| 10.2.0.0/16 | Link 2 |
| 10.3.0.0/16 | Link 3 |
| 10.4.0.0/16 | Link 4 |
Topology-Aware Addressing
Route aggregation keeps our forwarding tables small: R needs only one entry per /16, not one per server.
Nice example of what can be achieved in a controlled network.
R's forwarding table
| To: | Use: |
| 10.1.0.0/16 | Link 1 |
| 10.2.0.0/16 | Link 2 |
| 10.3.0.0/16 | Link 3 |
| 10.4.0.0/16 | Link 4 |
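A minimal sketch of how an aggregated table is used, with Python's standard ipaddress module. The table mirrors R's forwarding table; the lookup is ordinary longest-prefix match.

```python
import ipaddress

# Topology-aware aggregation sketch: the many per-rack /24s collapse into
# four per-/16 entries at router R. Longest-prefix match on the aggregated
# table still sends every destination out the right link.

table = {
    ipaddress.ip_network("10.1.0.0/16"): "Link 1",
    ipaddress.ip_network("10.2.0.0/16"): "Link 2",
    ipaddress.ip_network("10.3.0.0/16"): "Link 3",
    ipaddress.ip_network("10.4.0.0/16"): "Link 4",
}

def lookup(dst):
    addr = ipaddress.ip_address(dst)
    # Among all matching prefixes, pick the longest (most specific) one.
    matches = [net for net in table if addr in net]
    return table[max(matches, key=lambda n: n.prefixlen)]

print(lookup("10.3.2.1"))  # Link 3 -- one /16 entry covers every rack in that region
```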
Virtualization
Physical Datacenter Limitations
If we hosted applications directly on servers, we'd have some problems.
Fundamental issue:
Physical Datacenter Limitations
Scaling is another problem.
Routing is another problem.
Physical Datacenter Limitations
Applications choosing IP addresses is incompatible with our datacenter addressing.
Applications moving and keeping the same IP address is also incompatible.
[Diagram: routers R1–R3 behind core router R, with aggregated prefixes 10.1.0.0/16, 10.2.0.0/16, and 10.3.0.0/16.]
I am a Google search server, and I want to use IP address 192.0.2.1.
I am a YouTube server, and I want to use IP address 10.16.1.2.
R's forwarding table
| To: | Next hop: |
| 10.2.0.0/16 | R2 |
| 10.3.0.0/16 | R3 |
| 192.0.2.1 | R1 |
| 10.16.1.2 | R1 |
We can't aggregate these rows!
Virtual Machines
Virtualization lets us run one or more virtual machines on a single physical machine.
[Diagram: a physical server (hardware) runs a hypervisor (software), which hosts VM 1, VM 2, and VM 3.]
I want to write to disk.
VM 1 thinks it's talking to its own hardware disk, but it's actually talking to the hypervisor.
The hypervisor talks to the hardware disk (and ensures each VM gets its own dedicated slice of disk).
Virtual Machines
Why is virtualization useful? It lets us add and remove servers easily, and use server resources efficiently (several VMs share one physical machine).
Virtual Switches
The physical server has a single network card (NIC) and a single IP address.
Solution: Run a virtual switch in software, on the server.
[Diagram: server 1.1.1.1 runs VM1 (192.0.2.1), VM2 (192.168.1.2), and VM3 (10.16.1.2); a virtual switch on the server connects the VMs to router R1.]
Note: Virtual switches can run in software on a general-purpose CPU because they serve less traffic than a real-world hardware switch. (Just the traffic from the VMs on that server.)
Overlay and Underlay Networks
Routing with Virtualization
VMs let us easily add/remove servers, and use server resources efficiently.
But we still haven't solved our routing problem.
Problem: Routing with Virtualization
Key problem: We have 2 different addressing systems to think about.
The underlay network thinks in terms of physical addresses.
The overlay network (VMs) thinks in terms of virtual addresses.
[Diagram: Underlay — physical servers 1.1.1.1 and 2.2.2.2, connected by routers R1–R4. Overlay — VMs V1–V3 (192.0.2.1, 192.168.1.2, 10.16.1.2) behind a virtual switch on server 1, and VMs V4–V6 (10.7.7.7, 10.8.8.8, 192.0.5.7) behind a virtual switch on server 2.]
Problem: Routing with Virtualization
Ideally, we want to think about each layer separately.
Solution: A New Layer
How do we bridge the gap between the overlay and underlay?
Use the same layering and header strategies as in the Internet's design!
The new layer could be a second IP header, or a new kind of header (not IP).
Original design: [IP Header | TCP Header | Payload]
The new design: [IP (Underlay) Header | IP (Overlay) Header | TCP Header | Payload]
Encapsulation and Decapsulation
Let's see how to use the new layer to connect the overlay and underlay networks.
Encapsulation and Decapsulation
Our goal: VM1 wants to talk to VM6.
Packet so far: [Payload]
Encapsulation and Decapsulation (Step 1/5)
VM1 adds an overlay header with the destination's virtual address.
Then, VM1 passes the packet to the virtual switch.
Packet so far: [To: 192.0.5.7 | Payload]
Encapsulation and Decapsulation (Step 2/5)
The virtual switch reads the virtual address and looks up the matching physical address. Then, it adds (encapsulates) a new header with the physical address.
Then, the virtual switch forwards the packet to routers in the datacenter.
Packet so far: [To: 2.2.2.2 | To: 192.0.5.7 | Payload]
We haven't discussed how this lookup works yet. For now, it's magic.
Encapsulation and Decapsulation (Step 3/5)
The routers in the datacenter forward the packet according to its physical (underlay) address. No need to think about virtual addresses!
Packet so far: [To: 2.2.2.2 | To: 192.0.5.7 | Payload]
Encapsulation and Decapsulation (Step 4/5)
Eventually, R4 receives the packet and reads its physical (underlay) destination address, 2.2.2.2.
R4 is connected to physical server 2.2.2.2, so it forwards the packet to the server.
Packet so far: [To: 2.2.2.2 | To: 192.0.5.7 | Payload]
Encapsulation and Decapsulation (Step 5/5)
The virtual switch at 2.2.2.2 sees a packet destined for itself.
The virtual switch removes (decapsulates) the underlay header, revealing the virtual address of the destination.
Then, the virtual switch sends the packet to the VM with virtual address 192.0.5.7.
Packet so far: [To: 192.0.5.7 | Payload]
Encapsulation and Decapsulation
Success – our packet reached VM6!
Packet delivered: [To: 192.0.5.7 | Payload]
Encapsulation and Decapsulation
Why did this work?
Encapsulation and Decapsulation
Encapsulation: Adding the extra header.
Decapsulation: Removing the extra header, exposing the original header underneath.
[To: 192.0.5.7 | Payload] — the original packet only has the virtual (overlay) address.
[To: 2.2.2.2 | To: 192.0.5.7 | Payload] — we add the physical (underlay) address; the extra header helps the packet travel through the underlay network.
[To: 192.0.5.7 | Payload] — eventually, we remove the extra header, and the packet travels based on the virtual (overlay) address the rest of the way.
Encapsulation and Decapsulation
R2's forwarding table
| To: | Next hop: |
| 1.1.1.1 | R1 |
| 2.2.2.2 | R3 |
Only includes physical addresses (which can be aggregated!)

VM1's forwarding table
| To: | Next hop: |
| Anywhere | Virtual switch |

Virtual switch's forwarding table
| To: | Next hop: |
| 192.0.5.7 | Add header: 2.2.2.2, then send to R1 |
Haven't discussed how to map 192.0.5.7 → 2.2.2.2 yet. For now, it's magic.
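The five-step walkthrough can be sketched end to end. The mapping table below stands in for the "magic" virtual-to-physical lookup, and the dict-based packet format is purely illustrative.

```python
# Encapsulation/decapsulation sketch. Headers are modeled as nested dicts:
# the outer dict is the underlay header, the inner dict the overlay packet.

VIRT_TO_PHYS = {"192.0.5.7": "2.2.2.2", "192.0.2.1": "1.1.1.1"}  # the "magic" map

def vswitch_encapsulate(packet):
    """Virtual switch on the sending server: wrap in an underlay header."""
    phys = VIRT_TO_PHYS[packet["to"]]        # look up the physical server
    return {"to": phys, "inner": packet}     # encapsulate

def vswitch_decapsulate(packet, my_phys_addr):
    """Virtual switch on the receiving server: unwrap and deliver."""
    assert packet["to"] == my_phys_addr      # packet is destined for this server
    return packet["inner"]                   # reveal the overlay packet

# VM1 sends to VM6 (virtual address 192.0.5.7, hosted on server 2.2.2.2):
overlay = {"to": "192.0.5.7", "payload": "hello"}
underlay = vswitch_encapsulate(overlay)      # routers only read "to: 2.2.2.2"
delivered = vswitch_decapsulate(underlay, "2.2.2.2")
print(delivered)  # {'to': '192.0.5.7', 'payload': 'hello'}
```

Note that the routers in the middle never inspect the inner packet, which is exactly why their tables can stay small and aggregated.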
Multi-Tenancy and Private Networks
Multi-Tenancy
Datacenters are owned by a single operator, but they can have multiple tenants.
Different companies/departments are running on the same infrastructure.
(AWS = Amazon Web Services. GCP = Google Cloud Platform.)
Private IP Addressing
Different tenants don't coordinate when choosing addresses. Why is this a problem?
Two hosts (from two different tenants) could have the same IP address.
Routing with Multi-Tenancy
In this datacenter, Coke and Pepsi are two separate tenants.
[Diagram: on server 1.1.1.1, Coke's VM C1 and Pepsi's VM P1 both use address 192.0.2.1; on server 2.2.2.2, Coke's C2 and Pepsi's P2 both use 192.0.2.2.]
Packet: [To: 2.2.2.2 | To: 192.0.2.2 | Payload]
Encapsulations for Multi-Tenancy
Solution: Use encapsulation again!
Packet from a Pepsi VM, with a "P" tenant header added: [To: 2.2.2.2 | P | To: 192.0.2.2 | Payload]
Encapsulations for Multi-Tenancy
Similarly, a packet from a Coke VM gets a "C" tenant header: [To: 2.2.2.2 | C | To: 192.0.2.2 | Payload]
Encapsulations for Multi-Tenancy
This new extra header lets us distinguish between tenants.
Coke: [To: 2.2.2.2 | C | To: 192.0.5.7 | Payload]
Pepsi: [To: 2.2.2.2 | P | To: 192.0.5.7 | Payload]
Putting It Together – Stacking Encapsulations
We can use encapsulation for both virtualization and multi-tenancy.
Original packet from application: [IP (Overlay) Header | TCP Header | Payload]
Encapsulate, adding virtual network context: [Virtual Network Header | IP (Overlay) Header | TCP Header | Payload]
Encapsulate, adding the underlay destination: [IP (Underlay) Header | Virtual Network Header | IP (Overlay) Header | TCP Header | Payload]
Putting It Together – Stacking Encapsulations
We can use encapsulation for both virtualization and multi-tenancy.
Receive packet from underlay: [IP (Underlay) Header | Virtual Network Header | IP (Overlay) Header | TCP Header | Payload]
Decapsulate, exposing the virtual network header; decide which tenant to forward to: [Virtual Network Header | IP (Overlay) Header | TCP Header | Payload]
Decapsulate, exposing the overlay header; forward to the corresponding VM: [IP (Overlay) Header | TCP Header | Payload]
Implementing Encapsulation
What real-world protocols exist for adding extra headers? Examples include VXLAN, GRE, and Geneve.
Don't worry about the details – the idea of encapsulation is more important than the specific implementation.
Summary: Routing and Encapsulation in Datacenters