1 of 31

Computer Networks

Sina Keshvadi

Fall 2021

University of Calgary

Lecture No. 19

Oct 22, 2021

TCP Congestion Control

Transport Layer: 3-1

Today’s Agenda:

  1. TCP Congestion Control
  2. TCP in Wireshark

  • Assignment 2 Q/A

2 of 31

TCP slow start

  • when connection begins, increase rate exponentially until first loss event:
    • initially cwnd = 1 MSS
    • double cwnd every RTT
    • done by incrementing cwnd for every ACK received

[Figure: timeline between Host A and Host B; Host A sends one segment, then two segments, then four segments, one burst per RTT]

  • summary: initial rate is slow, but ramps up exponentially fast
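A minimal sketch of the doubling described above, assuming no loss and one ACK per segment. The function name `slow_start_rounds` is illustrative and not from the slides.

```python
def slow_start_rounds(rtts: int, mss: int = 1) -> list[int]:
    """Return cwnd at the start of each RTT round during slow start."""
    cwnd = mss                      # initially cwnd = 1 MSS
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        acks = cwnd // mss          # one ACK per segment sent this round (no loss assumed)
        for _ in range(acks):
            cwnd += mss             # +1 MSS per ACK, so cwnd doubles every RTT
    return history


print(slow_start_rounds(5))  # [1, 2, 4, 8, 16]
```

Incrementing per ACK rather than once per RTT is what produces the exponential growth: a window of n segments yields n ACKs in one round.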

3 of 31

TCP: from slow start to congestion avoidance

Implementation:

  • variable ssthresh
  • on loss event, ssthresh is set to 1/2 of cwnd just before loss event

Q: when should the exponential increase switch to linear?

A: when cwnd gets to 1/2 of its value before timeout.
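A rough sketch of the switch from exponential to linear growth, in MSS units per RTT. The names are illustrative; capping the doubling at ssthresh and flooring ssthresh at 1 MSS are simplifying assumptions, not something the slide specifies.

```python
def cwnd_per_rtt(rtts: int, ssthresh: int) -> list[int]:
    """cwnd (in MSS) at the start of each RTT, with no loss."""
    cwnd = 1
    history = []
    for _ in range(rtts):
        history.append(cwnd)
        if cwnd < ssthresh:
            cwnd = min(2 * cwnd, ssthresh)   # slow start: exponential (capped, an assumption)
        else:
            cwnd += 1                        # past ssthresh: linear, +1 MSS per RTT
    return history


def ssthresh_after_loss(cwnd: int) -> int:
    """On a loss event, ssthresh becomes half of cwnd just before the loss."""
    return max(cwnd // 2, 1)                 # floor of 1 MSS is an assumption


print(cwnd_per_rtt(8, ssthresh=8))  # [1, 2, 4, 8, 9, 10, 11, 12]
```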

4 of 31

Congestion Avoidance

On entry to the congestion-avoidance state, the value of cwnd is approximately half its value when congestion was last encountered.

TCP adopts a more conservative approach and increases the value of cwnd by just a single MSS every RTT.
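One common way to get "1 MSS per RTT" is to add MSS · (MSS/cwnd) bytes on every new ACK. The sketch below assumes that per-ACK rule, no loss, and one ACK per full segment; the function name is illustrative.

```python
def congestion_avoidance_round(cwnd: float, mss: int) -> float:
    """Apply the per-ACK increase for one RTT's worth of ACKs; cwnd in bytes."""
    acks = int(cwnd // mss)            # one ACK per full segment in the window
    for _ in range(acks):
        cwnd += mss * (mss / cwnd)     # small per-ACK step, roughly MSS/acks bytes
    return cwnd


start = 10 * 1460
grown = congestion_avoidance_round(start, 1460) - start
print(round(grown))  # slightly under one MSS (1460 bytes)
```

The growth per round is a little less than one MSS because cwnd in the denominator rises while the round's ACKs arrive.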

5 of 31

Fast Recovery

  • In fast recovery, the value of cwnd is increased by 1 MSS for every duplicate ACK received for the missing segment that caused TCP to enter the fast-recovery state.
  • If a timeout event occurs, fast recovery transitions to the slow-start state
  • Fast recovery is a recommended, but not required, component of TCP.

6 of 31

TCP: detecting, reacting to loss

  • loss indicated by timeout:
    • cwnd set to 1 MSS
    • window then grows exponentially (as in slow start) to threshold, then grows linearly

  • loss indicated by 3 duplicate ACKs: TCP Reno
    • dup ACKs indicate network capable of delivering some segments
    • cwnd is cut in half; window then grows linearly

  • TCP Tahoe always sets cwnd to 1 MSS (timeout or 3 duplicate ACKs)

7 of 31

With and Without Fast Recovery

Fast recovery is a recommended, but not required, component of TCP.

  • Tahoe: if three duplicate ACKs are received,
    • performs a fast retransmit
    • sets the slow start threshold to half of the current congestion window
    • reduces the congestion window to 1 MSS
    • resets to slow start state

  • Reno: if three duplicate ACKs are received,
    • reduces the congestion window to half
    • sets the slow start threshold equal to the new congestion window
    • enters a phase called fast recovery

In both Tahoe and Reno, if an ACK times out (RTO timeout), slow start is used, and both algorithms reduce congestion window to 1 MSS.
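The two reactions can be compared side by side. This is a sketch in MSS units that follows the wording of this slide (Reno halves cwnd and sets ssthresh to the new cwnd); the function names and the 1 MSS floor are assumptions.

```python
def on_triple_dup_ack(variant: str, cwnd: int) -> tuple[int, int]:
    """Return (new_cwnd, new_ssthresh) after three duplicate ACKs."""
    half = max(cwnd // 2, 1)          # floor of 1 MSS is an assumption
    if variant == "tahoe":
        return 1, half                # cwnd to 1 MSS, back to slow start
    if variant == "reno":
        return half, half             # cwnd halved, then fast recovery
    raise ValueError(f"unknown variant: {variant}")


def on_timeout(cwnd: int) -> tuple[int, int]:
    """Both Tahoe and Reno: cwnd to 1 MSS, then slow start."""
    return 1, max(cwnd // 2, 1)


print(on_triple_dup_ack("tahoe", 16))  # (1, 8)
print(on_triple_dup_ack("reno", 16))   # (8, 8)
print(on_timeout(16))                  # (1, 8)
```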

8 of 31

TCP congestion control: AIMD

  • approach: senders can increase sending rate until packet loss (congestion) occurs, then decrease sending rate on loss event

Additive Increase: increase sending rate by 1 maximum segment size every RTT until loss detected

Multiplicative Decrease: cut sending rate in half at each loss event

[Figure: AIMD sawtooth behavior (probing for bandwidth); TCP sender sending rate vs. time]

9 of 31

TCP AIMD: more

Multiplicative decrease detail: sending rate is

  • Cut in half on loss detected by triple duplicate ACK (TCP Reno)
  • Cut to 1 MSS (maximum segment size) when loss detected by timeout (TCP Tahoe and Reno; Tahoe also does this on triple duplicate ACK)

Why AIMD?

  • AIMD – a distributed, asynchronous algorithm – has been shown to:
    • optimize congested flow rates network wide!
    • have desirable stability properties

10 of 31

Summary: TCP congestion control

Initial state (Λ): cwnd = 1 MSS, ssthresh = 64 KB, dupACKcount = 0; enter slow start

slow start

  • new ACK: cwnd = cwnd + MSS; dupACKcount = 0; transmit new segment(s), as allowed
  • duplicate ACK: dupACKcount++
  • cwnd > ssthresh (Λ): move to congestion avoidance
  • dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; move to fast recovery
  • timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; stay in slow start

congestion avoidance

  • new ACK: cwnd = cwnd + MSS · (MSS/cwnd); dupACKcount = 0; transmit new segment(s), as allowed
  • duplicate ACK: dupACKcount++
  • dupACKcount == 3: ssthresh = cwnd/2; cwnd = ssthresh + 3 MSS; retransmit missing segment; move to fast recovery
  • timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; move to slow start

fast recovery

  • duplicate ACK: cwnd = cwnd + MSS; transmit new segment(s), as allowed
  • new ACK: cwnd = ssthresh; dupACKcount = 0; move to congestion avoidance
  • timeout: ssthresh = cwnd/2; cwnd = 1 MSS; dupACKcount = 0; retransmit missing segment; move to slow start
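The three-state machine on this slide can be sketched as a small class. This is an illustrative model only: it works in MSS units (so the 64 KB initial ssthresh becomes an assumed 64 MSS), the class and method names are made up here, and retransmission itself is left to the caller.

```python
from dataclasses import dataclass

MSS = 1  # work in units of one MSS (a simplification)


@dataclass
class RenoSender:
    cwnd: float = 1 * MSS
    ssthresh: float = 64.0          # slide uses 64 KB; MSS units assumed here
    dup_acks: int = 0
    state: str = "slow_start"

    def on_new_ack(self) -> None:
        if self.state == "fast_recovery":
            self.cwnd = self.ssthresh                 # deflate the window
            self.state = "congestion_avoidance"
        elif self.state == "slow_start":
            self.cwnd += MSS
            if self.cwnd > self.ssthresh:
                self.state = "congestion_avoidance"
        else:
            self.cwnd += MSS * (MSS / self.cwnd)      # about 1 MSS per RTT
        self.dup_acks = 0

    def on_dup_ack(self) -> None:
        if self.state == "fast_recovery":
            self.cwnd += MSS
            return
        self.dup_acks += 1
        if self.dup_acks == 3:
            self.ssthresh = self.cwnd / 2
            self.cwnd = self.ssthresh + 3 * MSS
            self.state = "fast_recovery"              # caller retransmits missing segment

    def on_timeout(self) -> None:
        self.ssthresh = self.cwnd / 2
        self.cwnd = 1 * MSS
        self.dup_acks = 0
        self.state = "slow_start"                     # caller retransmits missing segment
```

For example, with `RenoSender(ssthresh=4)`, four new ACKs take cwnd from 1 to 5 and move the sender into congestion avoidance; three duplicate ACKs then set ssthresh to 2.5 and cwnd to 5.5 in fast recovery.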

11 of 31

TCP congestion control: details

  • TCP sender limits transmission: LastByteSent - LastByteAcked ≤ cwnd
  • cwnd is dynamically adjusted in response to observed network congestion (implementing TCP congestion control)

[Figure: sender sequence number space, showing last byte ACKed, bytes sent but not-yet ACKed (“in-flight”, bounded by cwnd), last byte sent, and bytes available but not used]

TCP sending behavior:

  • roughly: send cwnd bytes, wait RTT for ACKs, then send more bytes

$$\text{TCP rate} \approx \frac{cwnd}{RTT}\ \text{bytes/sec}$$
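A small sketch of the window check and the rate approximation. The function names and the sample numbers are illustrative assumptions.

```python
def can_send(last_byte_sent: int, last_byte_acked: int, cwnd: int, nbytes: int) -> bool:
    """True if sending nbytes more keeps the in-flight bytes within cwnd."""
    in_flight = last_byte_sent - last_byte_acked
    return in_flight + nbytes <= cwnd


def approx_rate(cwnd_bytes: float, rtt_s: float) -> float:
    """Rough TCP rate in bytes/sec: one window per round-trip time."""
    return cwnd_bytes / rtt_s


print(can_send(last_byte_sent=9000, last_byte_acked=4000, cwnd=8000, nbytes=1460))  # True
print(approx_rate(14600, 0.5))  # 29200.0
```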

12 of 31

TCP throughput

  • avg. TCP thruput as function of window size, RTT?
    • ignore slow start, assume always data to send
  • W: window size (measured in bytes) where loss occurs
    • avg. window size (# in-flight bytes) is ¾ W
    • avg. thruput is 3/4W per RTT

[Figure: sawtooth of window size oscillating between W/2 and W]

$$\text{avg TCP thruput} = \frac{3}{4}\,\frac{W}{RTT}\ \text{bytes/sec}$$
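The 3/4 factor is just the mean of a window that ramps linearly from W/2 up to W. A quick numeric check, with illustrative function names and W assumed even:

```python
def average_window(w_max: int) -> float:
    """Mean of a window that climbs from W/2 to W in equal steps (W assumed even)."""
    samples = range(w_max // 2, w_max + 1)
    return sum(samples) / len(samples)


def average_throughput(w_max_bytes: float, rtt_s: float) -> float:
    """Average throughput in bytes/sec for the idealized sawtooth."""
    return 0.75 * w_max_bytes / rtt_s


print(average_window(100))  # 75.0, i.e. 3/4 of W
```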

13 of 31

TCP Futures: TCP over “long, fat pipes”

  • example: 1500 byte segments, 100ms RTT, want 10 Gbps throughput
  • requires W = 83,333 in-flight segments
  • throughput in terms of segment loss probability, L [Mathis 1997]:

$$\text{TCP throughput} = \frac{1.22 \cdot MSS}{RTT \sqrt{L}}$$

  • to achieve 10 Gbps throughput, need a loss rate of L = 2·10^-10, a very small loss rate!
  • new versions of TCP for high-speed
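The numbers on this slide can be reproduced with a few lines. This sketch assumes throughput is measured in bits/sec and MSS in bytes; the function names are illustrative.

```python
import math


def segments_in_flight(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Window (in segments) needed to sustain the target throughput."""
    return throughput_bps * rtt_s / (mss_bytes * 8)


def mathis_throughput_bps(mss_bytes: int, rtt_s: float, loss: float) -> float:
    """Throughput = 1.22 * MSS / (RTT * sqrt(L)), in bits/sec."""
    return 1.22 * mss_bytes * 8 / (rtt_s * math.sqrt(loss))


def required_loss(throughput_bps: float, rtt_s: float, mss_bytes: int) -> float:
    """Solve the same formula for the loss probability L."""
    return (1.22 * mss_bytes * 8 / (rtt_s * throughput_bps)) ** 2


print(round(segments_in_flight(10e9, 0.1, 1500)))  # 83333
print(required_loss(10e9, 0.1, 1500))              # about 2.1e-10
```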

14 of 31

TCP CUBIC

  • Is there a better way than AIMD to “probe” for usable bandwidth?

[Figure: window size over time between Wmax/2 and Wmax for classic TCP and TCP CUBIC; TCP CUBIC achieves higher throughput in this example]

  • Insight/intuition:
    • Wmax: sending rate at which congestion loss was detected
    • congestion state of bottleneck link probably (?) hasn’t changed much

    • after cutting rate/window in half on loss, initially ramp to Wmax faster, but then approach Wmax more slowly

15 of 31

TCP CUBIC

  • K: point in time when TCP window size will reach Wmax
    • K itself is tuneable

    • larger increases when further away from K
    • smaller increases (cautious) when nearer K

[Figure: TCP sending rate vs. time (t0 through t4) for TCP Reno and TCP CUBIC, both approaching Wmax]

  • TCP CUBIC default in Linux, most popular TCP for popular Web servers

  • increase W as a function of the cube of the distance between current time and K
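A sketch of a cubic window function of this shape, W(t) = C·(t − K)³ + Wmax, with K chosen so that the window starts at β·Wmax right after a loss. The form follows the published CUBIC specification; β = 0.5 matches the "cut in half" wording on these slides, while C = 0.4 and the function names are assumptions made for illustration.

```python
def cubic_k(w_max: float, beta: float = 0.5, c: float = 0.4) -> float:
    """Time (seconds after the loss) at which the window returns to Wmax."""
    return (w_max * (1 - beta) / c) ** (1 / 3)


def cubic_window(t: float, w_max: float, beta: float = 0.5, c: float = 0.4) -> float:
    """Window (in MSS) t seconds after the loss: cube of the distance to K."""
    k = cubic_k(w_max, beta, c)
    return c * (t - k) ** 3 + w_max


w_max = 100.0
k = cubic_k(w_max)                                                  # about 5 seconds here
early_step = cubic_window(1, w_max) - cubic_window(0, w_max)       # far from K: large increase
late_step = cubic_window(k, w_max) - cubic_window(k - 1, w_max)    # near K: small increase
print(early_step > late_step)  # True
```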

16 of 31

TCP and the congested “bottleneck link”

  • TCP (classic, CUBIC) increase TCP’s sending rate until packet loss occurs at some router’s output: the bottleneck link

[Figure: source and destination protocol stacks (application, TCP, network, link, physical) connected through a bottleneck link that is almost always busy; its packet queue is almost never empty and sometimes overflows (packet loss)]

17 of 31

TCP and the congested “bottleneck link”


  • understanding congestion: useful to focus on congested bottleneck link

insight: increasing TCP sending rate will not increase end-end throughput with congested bottleneck

insight: increasing TCP sending rate will increase measured RTT

Goal: “keep the end-end pipe just full, but not fuller”

18 of 31

Delay-based TCP congestion control

Keeping sender-to-receiver pipe “just full enough, but no fuller”: keep bottleneck link busy transmitting, but avoid high delays/buffering

Delay-based approach:

  • RTTmin - minimum observed RTT (uncongested path)
  • uncongested throughput with congestion window cwnd is cwnd/RTTmin

$$\text{measured throughput} = \frac{\text{\# bytes sent in last RTT interval}}{RTT_{measured}}$$

if measured throughput “very close” to uncongested throughput
    increase cwnd linearly /* since path not congested */
else if measured throughput “far below” uncongested throughput
    decrease cwnd linearly /* since path is congested */
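The rule above can be sketched as a single update step. The slide leaves “very close” and “far below” unquantified, so the `close` and `far` thresholds, the 1 MSS step, and the function name are all hypothetical choices.

```python
def delay_based_update(cwnd: float, bytes_sent_last_rtt: float,
                       rtt_measured: float, rtt_min: float,
                       mss: float = 1460.0,
                       close: float = 0.9, far: float = 0.5) -> float:
    """Return the new cwnd (bytes) after comparing measured and uncongested throughput."""
    uncongested = cwnd / rtt_min                    # best case: no queueing delay
    measured = bytes_sent_last_rtt / rtt_measured
    if measured >= close * uncongested:             # "very close": path not congested
        return cwnd + mss                           # increase linearly
    if measured <= far * uncongested:               # "far below": path is congested
        return max(cwnd - mss, mss)                 # decrease linearly, keep at least 1 MSS
    return cwnd                                     # in between: leave cwnd unchanged


print(delay_based_update(14600, 14600, rtt_measured=0.10, rtt_min=0.10))  # 16060.0
print(delay_based_update(14600, 14600, rtt_measured=0.25, rtt_min=0.10))  # 13140.0
```

Here the only congestion signal is the growth of the measured RTT over RTTmin; no packet has to be lost for the sender to back off.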

19 of 31

Delay-based TCP congestion control

  • congestion control without inducing/forcing loss
  • maximizing throughput (“keeping the pipe just full…”) while keeping delay low (“…but not fuller”)
  • a number of deployed TCPs take a delay-based approach
    • BBR deployed on Google’s (internal) backbone network

20 of 31

Explicit congestion notification (ECN)


TCP deployments often implement network-assisted congestion control:

  • two bits in IP header (ToS field) marked by network router to indicate congestion
    • policy to determine marking chosen by network operator
  • congestion indication carried to destination
  • destination sets ECE bit on ACK segment to notify sender of congestion
  • involves both IP (IP header ECN bit marking) and TCP (TCP header C,E bit marking)

[Figure: source sends an IP datagram with ECN=10; a congested router re-marks it ECN=11; the destination returns a TCP ACK segment with ECE=1]
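A toy model of the marking path, using the standard two-bit ECN codepoints (00 not ECN-capable, 01 and 10 ECN-capable, 11 congestion experienced). The function names are illustrative, and the congestion decision is reduced to a boolean because the real marking policy is chosen by the network operator.

```python
NOT_ECT, ECT_1, ECT_0, CE = 0b00, 0b01, 0b10, 0b11  # two-bit ECN field values


def router_mark(ecn: int, congested: bool) -> int:
    """A congested router re-marks ECN-capable packets as Congestion Experienced."""
    if congested and ecn in (ECT_0, ECT_1):
        return CE
    return ecn                                      # otherwise leave the field untouched


def receiver_ece(ecn: int) -> int:
    """The destination sets ECE=1 on its ACK when the datagram arrived marked CE."""
    return 1 if ecn == CE else 0


arrived = router_mark(ECT_0, congested=True)        # 10 becomes 11
print(format(arrived, "02b"), receiver_ece(arrived))  # 11 1
```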

21 of 31

TCP fairness

Fairness goal: if K TCP sessions share same bottleneck link of bandwidth R, each should have average rate of R/K

[Figure: TCP connection 1 and TCP connection 2 sharing a bottleneck router of capacity R]

22 of 31

Q: is TCP Fair?

Example: two competing TCP sessions:

  • additive increase gives slope of 1, as throughput increases
  • multiplicative decrease decreases throughput proportionally

[Figure: Connection 2 throughput vs. Connection 1 throughput, each axis from 0 to R, with the equal bandwidth share line; the trajectory alternates between congestion avoidance (additive increase) and loss (decrease window by factor of 2)]

A: Yes, under idealized assumptions:

  • same RTT
  • fixed number of sessions only in congestion avoidance
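A toy simulation of the two-session example under the idealized assumptions listed above: same RTT, synchronized losses, both sessions always in congestion avoidance. The function name and the starting rates are illustrative.

```python
def aimd_two_flows(x1: float, x2: float, capacity: float, rounds: int) -> tuple[float, float]:
    """Evolve two AIMD flows that share one bottleneck of the given capacity."""
    for _ in range(rounds):
        if x1 + x2 > capacity:
            x1, x2 = x1 / 2, x2 / 2      # loss: both flows halve (gap between them halves too)
        else:
            x1, x2 = x1 + 1, x2 + 1      # additive increase: slope of 1 (gap unchanged)
    return x1, x2


x1, x2 = aimd_two_flows(1.0, 50.0, capacity=100.0, rounds=2000)
print(abs(x1 - x2) < 1e-6)  # True: the two rates have converged
```

Additive increase preserves the difference between the two rates and every multiplicative decrease halves it, which is why the flows drift toward the equal-share line.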

23 of 31

Fairness: must all network apps be “fair”?

Fairness and UDP

  • multimedia apps often do not use TCP
    • do not want rate throttled by congestion control
  • instead use UDP:
    • send audio/video at constant rate, tolerate packet loss
  • there is no “Internet police” policing use of congestion control

Fairness, parallel TCP connections

  • application can open multiple parallel connections between two hosts
  • web browsers do this, e.g., link of rate R with 9 existing connections:
    • new app asks for 1 TCP, gets rate R/10
    • new app asks for 11 TCPs, gets about R/2 (11 of the 20 connections)
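The arithmetic behind these two cases, assuming the bottleneck is split equally per connection. The function name is illustrative.

```python
from fractions import Fraction


def app_share(new_conns: int, existing_conns: int = 9) -> Fraction:
    """Fraction of the link rate R that an app gets with new_conns parallel connections."""
    return Fraction(new_conns, new_conns + existing_conns)


print(app_share(1))   # 1/10
print(app_share(11))  # 11/20, slightly more than R/2
```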

24 of 31

Chapter 3: summary

  • principles behind transport layer services:
    • multiplexing, demultiplexing
    • reliable data transfer
    • flow control
    • congestion control
  • instantiation, implementation in the Internet
    • UDP
    • TCP

Up next:

  • leaving the network “edge” (application, transport layers)
  • into the network “core”
  • two network-layer chapters:
    • data plane
    • control plane

26 of 31

Transport layer: roadmap

  • Transport-layer services
  • Multiplexing and demultiplexing
  • Connectionless transport: UDP
  • Principles of reliable data transfer
  • Connection-oriented transport: TCP
  • Principles of congestion control
  • TCP congestion control
  • Evolution of transport-layer functionality

27 of 31

Evolving transport-layer functionality

  • TCP, UDP: principal transport protocols for 40 years
  • different “flavors” of TCP developed, for specific scenarios:

Scenario                                | Challenges
Long, fat pipes (large data transfers)  | Many packets “in flight”; loss shuts down pipeline
Wireless networks                       | Loss due to noisy wireless links, mobility; TCP treats this as congestion loss
Long-delay links                        | Extremely long RTTs
Data center networks                    | Latency sensitive
Background traffic flows                | Low priority, “background” TCP flows

  • moving transport-layer functions to application layer, on top of UDP
    • HTTP/3: QUIC

28 of 31

QUIC: Quick UDP Internet Connections

  • application-layer protocol, on top of UDP
    • increase performance of HTTP
    • deployed on many Google servers, apps (Chrome, mobile YouTube app)

[Figure: protocol stacks across the network, transport, and application layers. Left, HTTP/2 over TCP: IP, TCP, TLS, HTTP/2 (bottom to top). Right, HTTP/2 over QUIC over UDP (HTTP/3): IP, UDP, QUIC, HTTP/2 (slimmed).]

29 of 31

QUIC: Quick UDP Internet Connections

  • multiple application-level “streams” multiplexed over single QUIC connection
    • separate reliable data transfer, security
    • common congestion control

QUIC adopts approaches we’ve studied in this chapter for connection establishment, error control, congestion control:

    • error and congestion control: “Readers familiar with TCP’s loss detection and congestion control will find algorithms here that parallel well-known TCP ones.” [from QUIC specification]
    • connection establishment: reliability, congestion control, authentication, encryption, state established in one RTT

30 of 31

QUIC: Connection establishment

TCP (reliability, congestion control state) + TLS (authentication, crypto state)

  • 2 serial handshakes

QUIC: reliability, congestion control, authentication, crypto state

  • 1 handshake

[Figure: left, TCP handshake (transport layer) followed by TLS handshake (security) before data is sent; right, a single QUIC handshake before data is sent]

31 of 31

QUIC: streams: parallelism, no HOL blocking

[Figure: (a) HTTP 1.1: three HTTP GETs pass through a shared TLS encryption, TCP RDT, and TCP Cong. Contr. pipeline at each end, so an error on one stalls the others. (b) HTTP/2 with QUIC: each HTTP GET has its own QUIC encrypt and QUIC RDT stream over a common QUIC Cong. Cont. and UDP, so an error affects only its own stream (no HOL blocking).]