1 of 8

Load Balancing in Fast Datacenter Networks - Two Approaches

Hassan Sajwani

2 of 8

Network Load Balancing and the Data Center

  • Modern data centers have grown to serve heavy traffic, often many millions of requests per second, across thousands of servers
  • Load balancers must distribute incoming network traffic in a way that maximizes speed and capacity utilization
  • Load balancers sit between clients and servers and route each request to the server most capable of fulfilling it (see the sketch below)
  • The importance of this infrastructure has led to significant research into improving current designs

https://cloud.google.com/compute/docs/load-balancing/images/http_load_balancing_cross_region.png
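A small sketch can illustrate the core decision a load balancer makes for each request. The least-connections policy, backend names, and counts below are illustrative assumptions for this deck, not the design of any particular system cited later:

```python
# Minimal sketch of a load-balancer routing decision (illustrative only):
# route each new request to the backend with the fewest in-flight connections.
# Backend names and connection counts are assumptions made up for this example.

active_connections = {
    "backend-1": 12,
    "backend-2": 4,
    "backend-3": 9,
}

def pick_backend(conns):
    """Return the backend currently holding the fewest in-flight connections."""
    return min(conns, key=conns.get)

chosen = pick_backend(active_connections)
active_connections[chosen] += 1   # account for the newly routed request
print("routing request to", chosen)
```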

3 of 8

Traditional Load Balancing

  • Two load balancing architecture models have dominated in recent years
    • “Layer 4” - commonly hardware based; may also use information from layers below 4, since the names are only loosely related to the OSI network model
    • “Layer 7” - software/application layer based
  • Traditional implementations of load balancers
    • Mainly involved dedicated hardware devices
    • Often proprietary and provided by a vendor
    • Required less computation overhead than sophisticated layer 7 approaches
    • Were popular when commodity servers did not have the power they do now and interactions between clients and servers were less complex

4 of 8

Load Balancing Today

  • Hardware based solutions had several disadvantages
    • Low serviceability and deployability
    • Employed costly specialized hardware
    • Constrained redundancy
    • Less flexible
  • Thus, modern approaches tend to take a layer 7 based approach
    • Can run on commodity hardware
    • Have access to application layer information
  • ECMP (Equal Cost Multi-Path)
    • A very popular approach to load balancing in data centers today (Google’s Maglev, for example, relies on routers using ECMP to spread traffic across its machines)
    • Hashes packet header fields to choose among several equal-cost “best” paths through the network, keeping each flow pinned to one path (sketched below)
    • Easily implemented

https://en.wikipedia.org/wiki/Equal-cost_multi-path_routing#/media/File:802d1aqECMP.gif
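A minimal sketch of the ECMP idea, assuming a 5-tuple hash over header fields; real switches use their own hardware hash functions, so the hash choice and field names here are illustrative assumptions:

```python
# Sketch of ECMP path selection: hash a packet's header fields to one of
# several equal-cost next hops. The 5-tuple key and SHA-256 hash are
# assumptions for illustration, not what switch hardware actually uses.

import hashlib

def ecmp_path(src_ip, dst_ip, src_port, dst_port, proto, num_paths):
    """Map a packet's 5-tuple to one of num_paths equal-cost paths."""
    key = f"{src_ip},{dst_ip},{src_port},{dst_port},{proto}".encode()
    digest = hashlib.sha256(key).digest()
    return int.from_bytes(digest[:4], "big") % num_paths

# All packets of a flow share a 5-tuple, so the flow stays pinned to one path,
# while different flows spread roughly uniformly across the equal-cost paths.
print(ecmp_path("10.0.0.1", "10.0.1.9", 43512, 80, 6, num_paths=4))
```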

5 of 8

FlowBender

  • Software based solution - no hardware changes required and very minimal changes overall
  • Addresses ECMP’s static flow-to-path assignment, which offers no flexibility when network paths become oversubscribed
  • Dynamically reacts to network conditions only in the face of congestion or link failure (notified of congestion via ECN and of failures via TCP timeouts)
  • Reroutes by changing a header field that feeds the ECMP hash (such as TTL), so switches compute a new path for the flow (see the sketch below)
  • Relies on commodity ECMP whenever it is not rerouting
  • Avoids excessive packet reordering - a known problem in multipath routing - by rerouting only when necessary
  • Avoids the sub-flow granularity routing other proposals have required, which leads to excessive packet reordering
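A minimal host-side sketch of FlowBender’s rerouting idea, assuming an ECN-mark threshold checked once per RTT and a TTL offset as the extra hash input; the threshold value, the random perturbation, and the class layout are assumptions for illustration, not the paper’s implementation:

```python
# Sketch of FlowBender-style rerouting: track the fraction of ECN-marked ACKs
# per RTT, and when it crosses a threshold (or a TCP timeout fires), change a
# header field that feeds the ECMP hash (here, a TTL offset) so switches pick
# a new path for the flow. Threshold and perturbation range are assumed values.

import random

ECN_THRESHOLD = 0.05   # assumed fraction of marked ACKs that signals congestion
BASE_TTL = 64

class FlowState:
    def __init__(self):
        self.ttl_offset = 0      # extra hash input seen by ECMP-capable switches
        self.acks = 0
        self.ecn_marked = 0

    def on_ack(self, ecn_marked):
        """Record one ACK and whether it carried an ECN congestion mark."""
        self.acks += 1
        self.ecn_marked += int(ecn_marked)

    def end_of_rtt(self, timeout=False):
        """Once per RTT: reroute only if congested or a timeout occurred."""
        congested = self.acks > 0 and (self.ecn_marked / self.acks) > ECN_THRESHOLD
        if congested or timeout:
            self.ttl_offset = random.randint(1, 8)   # new hash input -> new path
        self.acks = self.ecn_marked = 0

    def ttl_for_next_packet(self):
        return BASE_TTL + self.ttl_offset
```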

6 of 8

Presto

  • Addresses ECMP’s weakness with long flows, where limited packet header entropy can lead to hash collisions and, by extension, congestion
  • Also a software based implementation
  • Load balances at the granularity of “flowcells”: the vSwitch divides flows into uniformly sized 64 KB subflows (sketched below)
    • The chosen size ensures elephant flows will be broken up while the majority of mice flows will not
  • Breaking up flows onto different paths requires handling reordering
  • The designers modified the Generic Receive Offload (GRO) flush algorithm in the hypervisor OS to push segments up less aggressively, taking a wait-and-see approach
    • Like the related LRO, GRO aggregates packets into a buffer and pushes them up all at once, avoiding many interrupts, increasing throughput, and lowering CPU overhead
  • The modified GRO puts packets that arrive out of order back in sequence, and pushes segments up to TCP when packets are genuinely lost so that TCP can recover them
  • Implemented multipathing end to end, with centralized path setup, to allow for greater control and scheduling
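A minimal sketch of Presto-style flowcell segmentation at the vSwitch, with an assumed round-robin choice over a precomputed path set; the path labels are made up for illustration, and the paper itself spreads flowcells over controller-installed multipath routes:

```python
# Sketch of flowcell segmentation: carve a flow into 64 KB flowcells and send
# each flowcell over a different path. Round-robin path choice and the path
# labels are assumptions for illustration.

FLOWCELL_BYTES = 64 * 1024

def flowcells(flow_bytes, paths):
    """Yield (offset, length, path) for each flowcell of a flow of flow_bytes bytes."""
    offset, i = 0, 0
    while offset < flow_bytes:
        length = min(FLOWCELL_BYTES, flow_bytes - offset)
        yield offset, length, paths[i % len(paths)]
        offset += length
        i += 1

# A 200 KB "elephant" flow is split across several paths ...
for cell in flowcells(200 * 1024, ["path-A", "path-B", "path-C"]):
    print(cell)

# ... while a 10 KB "mouse" flow fits in a single flowcell on one path.
print(list(flowcells(10 * 1024, ["path-A", "path-B", "path-C"])))
```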

7 of 8

Critique/Reflection

  • Presto required a relatively more complex set of modifications, but seemed to track toward optimal performance in evaluations
  • Presto’s evaluation results seemed to be too good to be true
  • Presto’s evaluation of the CPU cost for the GRO modifications seemed optimistic and was not evaluated under a configuration that involved reordering
  • Presto’s evaluation did not compare against many of the other, more recently proposed systems
  • FlowBender was a clever design that made the most of available resources, with performance that tracked more costly proposals and beat ECMP by large margins
  • It did not have to rely on breaking up flows, which meant less reordering overhead
  • FlowBender’s evaluation did not present much rationale for the choice of comparison systems (DeTail and RPS)

8 of 8

Sources

Eisenbud, D., et al. (2016). Maglev: A Fast and Reliable Software Network Load Balancer. 13th USENIX Symposium on Networked Systems Design and Implementation (NSDI 16) (pp. 523-535). Santa Clara: USENIX Association.

He, K., et al. (2015). Presto: Edge-based Load Balancing for Fast Datacenter Networks. SIGCOMM '15: Proceedings of the 2015 ACM Conference on Special Interest Group on Data Communication (pp. 465-478). London, United Kingdom: ACM.

Kabbani, A., et al. (2014). FlowBender: Flow-level Adaptive Routing for Improved Latency and Throughput in Datacenter Networks. CoNEXT '14: Proceedings of the 10th ACM International Conference on Emerging Networking Experiments and Technologies (pp. 149-160). Sydney, Australia: ACM.

Raiciu, C., et al. (2011). Improving Datacenter Performance and Robustness with Multipath TCP. SIGCOMM '11: Proceedings of the ACM SIGCOMM 2011 Conference (pp. 266-277). Toronto, Ontario, Canada: ACM.

What is Layer 4 Load Balancing? (n.d.). Retrieved from NGINX: https://www.nginx.com/resources/glossary/layer-4-load-balancing/

What is Load Balancing? (n.d.). Retrieved from NGINX: https://www.nginx.com/resources/glossary/load-balancing/