1 of 22

TCP Congestion Control and Reinforcement Learning

CSE331 - Computer Networks Lab

Guided by: Prof. Jitendra Bhatia

Prayag Savsani - AU1841035

Yashraj Kakkad - AU1841036

Kaushal Patil - AU1841040

2 of 22

  • 01 Introduction: description of the project and the work done
  • 02 ns-3, Testing CC algorithms: details about ns-3 testing and the topologies used
  • 03 RL for CC: details about our implementation of an RL agent for CC
  • 04 Results, Graphs and Conclusion: produced results, graphs, and their conclusions


3 of 22

Introduction to TCP Congestion Control

  • A network is said to be congested when too many packets contend for the same link and some of them have to be dropped.
  • Congestion control prevents senders from overwhelming the network.
  • TCP supports various mechanisms for controlling congestion.


4 of 22

Project Objectives

  • Topologies: applying CC to different topologies
  • Testing: testing different CC algorithms
  • RL: creating an RL agent capable of regulating the congestion window
  • Comparing: comparing the RL agent with TCP NewReno


5 of 22

ns-3 Simulation of TCP Congestion Control

Simulating a topology and analysing variations in congestion window and throughput.


6 of 22

What is ns-3?


  • A discrete-event network simulator
  • Targeted primarily for research and educational use
  • Free software, licensed under the GNU GPLv2 license

7 of 22

Our simulated topology in ns-3


Key features of the simulation:

  • A rate error model is introduced to induce retransmissions.
  • The TCP socket is interfaced, and callback functions are used to measure the congestion window.
  • Throughput is measured using FlowMonitor (a post-processing sketch follows below).
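As an illustration of the last point, here is a minimal post-processing sketch that computes per-flow and aggregate throughput from a FlowMonitor XML dump. It is not the project's actual script: the file name flow-stats.xml and the helper names are our assumptions.

```python
# Hypothetical post-processing sketch: compute per-flow and aggregate throughput
# from a FlowMonitor XML dump.  The file name "flow-stats.xml" is our assumption;
# such a file is produced in ns-3 with FlowMonitor's SerializeToXmlFile().
import xml.etree.ElementTree as ET


def parse_ns3_time(value):
    """FlowMonitor stores times as strings such as '+1234567890.0ns'."""
    return float(value.strip("+ns")) * 1e-9  # convert nanoseconds to seconds


def flow_throughputs(xml_path):
    """Return {flowId: throughput in Mbit/s} for every flow in the dump."""
    root = ET.parse(xml_path).getroot()
    throughputs = {}
    for flow in root.find("FlowStats").findall("Flow"):
        rx_bytes = int(flow.get("rxBytes"))
        start = parse_ns3_time(flow.get("timeFirstTxPacket"))
        stop = parse_ns3_time(flow.get("timeLastRxPacket"))
        if stop > start:
            throughputs[flow.get("flowId")] = rx_bytes * 8 / (stop - start) / 1e6
    return throughputs


if __name__ == "__main__":
    per_flow = flow_throughputs("flow-stats.xml")
    for flow_id, mbps in per_flow.items():
        print(f"Flow {flow_id}: {mbps:.3f} Mbit/s")
    print(f"Aggregate throughput: {sum(per_flow.values()):.3f} Mbit/s")
```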

8 of 22

TCP NewReno Simulation


[Graphs: variation in congestion window sizes; aggregate throughput.]

Simulation parameters:

  • Number of packets sent: 10000
  • Size of a packet: 1 KB
  • Max simulation time: 100 seconds

9 of 22

Reinforcement Learning and Congestion Control

Getting a DQN Agent to control the Congestion window based on Network parameters.


10 of 22

What is DQN? What is an Agent?


[Diagram: the Agent chooses Actions and sends them to the Environment; the Environment returns Observations and Rewards to the Agent.]
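As background (not spelled out on the slide): a DQN agent approximates the action-value function Q(s, a) with a neural network and is trained to minimise the loss below, where \(\theta^{-}\) denotes the weights of a periodically updated target network.

$$\mathcal{L}(\theta) = \mathbb{E}_{(s,a,r,s')}\left[\Big(r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\Big)^{2}\right]$$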

11 of 22

Environment and Actions

Observable Parameters

  • Unique Socket ID
  • TCP Env Type
  • Sim Time In us
  • Unique Node ID
  • Current Threshold
  • Current Congestion Window Size
  • Segment Size
  • Bytes In Flight Sum
  • Bytes In Flight Avg
  • Segments Acked Sum
  • Segments Acked Avg
  • Avg Rtt
  • Min Rtt
  • Avg Inter Tx
  • Avg Inter Rx
  • Throughput

Possible Actions

  • Increase Cwnd by 5000
  • Decrease Cwnd by 1000
  • Keep Cwnd Same

*We used TimeBasedTCPEnvironment from NS3Gym


Environment

  • The simulation environment
  • Provides observations (the state of the system)
  • Takes actions
  • Returns new observations after performing the given action
  • Responsible for simulating the network environment (an interaction-loop sketch follows below)
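A minimal sketch, assuming the ns3-gym Python bindings, of how an agent could step such an environment with the action set listed above. The port, the observation indices, the action format, and the helper name apply_action are our assumptions, not taken from the slides.

```python
# Minimal sketch of an agent stepping an ns3-gym TCP environment.
# Assumptions (not from the slides): the port, the observation indices,
# the [new_ssThresh, new_cWnd] action format, and the helper apply_action.
from ns3gym import ns3env

# The three actions listed above: grow cWnd by 5000, shrink it by 1000, or keep it.
CWND_DELTAS = [5000, -1000, 0]


def apply_action(obs, action_index):
    """Map a discrete action index to an action for the environment."""
    ssthresh = obs[4]                      # current threshold (index assumed from the list above)
    cwnd = obs[5]                          # current congestion window size
    new_cwnd = max(0, cwnd + CWND_DELTAS[action_index])
    return [ssthresh, new_cwnd]            # leave ssThresh unchanged in this sketch


env = ns3env.Ns3Env(port=5555, startSim=True)   # connects to the ns-3 simulation
obs = env.reset()
done = False
while not done:
    action_index = 2                            # placeholder policy: keep cWnd the same
    obs, reward, done, info = env.step(apply_action(obs, action_index))
env.close()
```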

12 of 22

Our Approach to RL Agent

Segments Acked in RTT (Sim Time)

Rewarding the neural network based on the number of segments acked in one step and the sim time taken for that step.

Maximising throughput

Rewarding the neural network solely based on the observed change in throughput.
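A hedged sketch of what these two reward formulations could look like in the agent's code; the normalisation and any constants are our assumptions, as the slides do not give the exact formulas.

```python
# Hedged sketches of the two reward formulations; the normalisation is our assumption.

def reward_segments_acked(segments_acked, step_sim_time_us):
    """Reward: segments acked in one step, normalised by that step's sim time."""
    if step_sim_time_us <= 0:
        return 0.0
    return segments_acked / (step_sim_time_us / 1e6)   # acked segments per second of sim time


def reward_throughput_delta(throughput, previous_throughput):
    """Reward: based solely on the observed change in throughput."""
    return throughput - previous_throughput
```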


13 of 22

Segments Acked in RTT (Sim Time)

[Graphs: results for the TCP RL agent (Pts Rx: 5900) and for TCP NewReno on the RL environment (Pts Rx: 5367). Axis scales differ because different environments were used for the simulations.]


14 of 22

Maximising throughput

[Graph: test run of a saved image of the online model; Pts Rx: 5900.]


15 of 22

Interesting Results:

Self-Regulating Behaviour

While testing various parameters, we found that the network sometimes becomes adept at self-regulation, transmitting more than 6000 packets in a single episode.

This ability, however, came with instability in the static model being trained and led to catastrophic forgetting or overfitting.


16 of 22

Interesting Results:

Online Learning and Transfer Learning

We tried further training a pretrained model and saw that it was capable of achieving even better self-regulating results; the sample shown achieves maximum throughput almost instantly.

In the context of CC, a model that is constantly trained on the link's parameters would perform better than a static saved model; our training script can handle this out of the box.
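As a rough illustration of this idea, the sketch below loads a saved model and keeps taking small DQN-style gradient steps on freshly collected transitions. It is not the project's actual training script; the model file name, replay-buffer size, and hyperparameters are assumptions.

```python
# Hypothetical sketch of online / transfer learning for the DQN agent.
# The model file name, the replay-buffer size, and the hyperparameters are assumptions.
import random
from collections import deque

import numpy as np
from tensorflow import keras

model = keras.models.load_model("dqn_cc_agent.h5")   # the saved "static image"
replay_buffer = deque(maxlen=10_000)
GAMMA = 0.95
BATCH_SIZE = 32


def online_update(transition):
    """Store one (state, action, reward, next_state, done) tuple and take a small training step."""
    replay_buffer.append(transition)
    if len(replay_buffer) < BATCH_SIZE:
        return
    batch = random.sample(replay_buffer, BATCH_SIZE)
    states = np.array([t[0] for t in batch])
    next_states = np.array([t[3] for t in batch])
    targets = model.predict(states, verbose=0)
    next_q = model.predict(next_states, verbose=0)
    for i, (_, action, reward, _, done) in enumerate(batch):
        targets[i][action] = reward if done else reward + GAMMA * np.max(next_q[i])
    model.fit(states, targets, epochs=1, verbose=0)    # keep learning while deployed
```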


17 of 22

Combining our learnings, we propose a hypothetical model that might be used in the future for CC based on deep learning.

Achievements of our model: self-regulating behaviour, online learning is possible, more parameters are taken into account, and throughput can be maximised.

Drawbacks of our model: instability in training due to inherent properties of NNs, the action space has to be set based on the maximum cWnd size, and a static image of the model might be overfit.

Possible solutions: an RL-based CC model combined with a deterministic algorithm such as Tahoe, Cubic, or Reno, which would bring stability to the actions taken by the agent.

A critic model could then choose between the action given by the RL agent and the action given by the deterministic algorithm, all with online learning (adding GPUs/TPUs to routers?).
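A toy sketch of the arbitration idea above. Nothing here is an existing implementation: every name is a placeholder, and the deterministic rule is a simplified NewReno-style stand-in.

```python
# Toy sketch of the proposed arbitration; every name here is a placeholder and
# the deterministic rule is a simplified NewReno-style stand-in.

def deterministic_cwnd(obs):
    """Simplified additive-increase rule: grow cWnd by one segment."""
    return obs["cWnd"] + obs["segmentSize"]


def rl_cwnd(obs, rl_agent):
    """cWnd proposed by the RL agent (increase / decrease / keep, as on the action slide)."""
    return rl_agent.propose(obs)


def choose_cwnd(obs, rl_agent, critic):
    """A critic network scores each candidate cWnd for the current state and picks the best."""
    candidates = [rl_cwnd(obs, rl_agent), deterministic_cwnd(obs)]
    scores = [critic.value(obs, cwnd) for cwnd in candidates]
    return candidates[scores.index(max(scores))]
```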


[Diagram: RL-based agent combined with a deterministic algorithm.]

18 of 22

Hypothetical Architecture

Boot-up: use the deterministic algorithm, based on slow start, to find the action space for the agent.

Working: get actions from both the RL agent and the deterministic algorithm, and decide between them using a critic network.


[Diagram: In the start-up phase, the deterministic algorithm runs until the first 3 duplicate ACKs (3DA) or a packet loss, which fixes the action space for the RL-based agent. In the work phase, observations are fed to both the RL-based agent and the deterministic algorithm; a critic network chooses among their actions (+, -, 0) and receives rewards. The work phase runs for X time, after which the system reboots with slow start from the then-present ssThresh.]

19 of 22

Comparison between non-RL and RL algorithms

We used the same topology for the RL-based simulations that was used in the non-RL simulations.


20 of 22


[Graph: variation in congestion window sizes.]

Simulation parameters:

  • Number of packets sent: 10000
  • Size of a packet: 1 KB
  • Max simulation time: 100 seconds

21 of 22

Our Team

Kaushal Patil - AU1841040

Yashraj Kakkad - AU1841036

Prayag Savsani - AU1841035


22 of 22

Thanks
