TCP Congestion Control and Reinforcement Learning
CSE331 - Computer Networks Lab
Guided by: Prof. Jitendra Bhatia
Prayag Savsani - AU1841035
Yashraj Kakkad - AU1841036
Kaushal Patil - AU1841040
01 Introduction: Description of the project and work done.
02 NS3, Testing CC algorithms: Details about ns-3 testing and topologies.
03 RL for CC: Details about our implementation of an RL agent for CC.
04 Results, Graphs and Conclusion: Produced results, graphs, and their conclusions.
Introduction to TCP Congestion Control
Project Objectives
Topologies: Applying CC to different topologies
Testing: Testing different CC algorithms
RL: Creating an RL agent capable of regulating the congestion window
Comparing: Comparing the RL agent with TCP NewReno
ns-3 Simulation of TCP Congestion Control
Simulating a topology and analysing variations in congestion window and throughput.
What is ns-3?
Our simulated topology in ns-3
Key features of the simulation:
TCP NewReno Simulation
Variation in Congestion Window Sizes:
Aggregate Throughput:
Number of Packets Sent | 10000
Size of a Packet       | 1 KB
Max Simulation Time    | 100 seconds
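As a rough illustration of how such a run can be wired up (not our exact script), the sketch below uses ns-3's legacy Python bindings to select TcpNewReno and drive a simple two-node point-to-point link with the parameters above; the topology, data rate, and delay values are placeholders, and exact binding calls may differ between ns-3 releases.

```python
import ns.core
import ns.network
import ns.point_to_point
import ns.internet
import ns.applications

# Use NewReno for every TCP socket created in the simulation.
ns.core.Config.SetDefault("ns3::TcpL4Protocol::SocketType",
                          ns.core.StringValue("ns3::TcpNewReno"))

# Placeholder topology: two nodes on a point-to-point link.
nodes = ns.network.NodeContainer()
nodes.Create(2)
p2p = ns.point_to_point.PointToPointHelper()
p2p.SetDeviceAttribute("DataRate", ns.core.StringValue("5Mbps"))
p2p.SetChannelAttribute("Delay", ns.core.StringValue("2ms"))
devices = p2p.Install(nodes)

stack = ns.internet.InternetStackHelper()
stack.Install(nodes)
addresses = ns.internet.Ipv4AddressHelper()
addresses.SetBase(ns.network.Ipv4Address("10.1.1.0"),
                  ns.network.Ipv4Mask("255.255.255.0"))
interfaces = addresses.Assign(devices)

# Sender: 10000 packets of 1 KB each (MaxBytes = 10000 * 1024).
port = 8080
source = ns.applications.BulkSendHelper(
    "ns3::TcpSocketFactory",
    ns.network.Address(ns.network.InetSocketAddress(interfaces.GetAddress(1), port)))
source.SetAttribute("MaxBytes", ns.core.UintegerValue(10000 * 1024))
source_apps = source.Install(nodes.Get(0))

sink = ns.applications.PacketSinkHelper(
    "ns3::TcpSocketFactory",
    ns.network.Address(ns.network.InetSocketAddress(ns.network.Ipv4Address.GetAny(), port)))
sink_apps = sink.Install(nodes.Get(1))

# Cap the run at the 100-second maximum simulation time.
source_apps.Start(ns.core.Seconds(0.0))
sink_apps.Start(ns.core.Seconds(0.0))
ns.core.Simulator.Stop(ns.core.Seconds(100))
ns.core.Simulator.Run()
ns.core.Simulator.Destroy()
```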
Reinforcement Learning and Congestion Control
Getting a DQN Agent to control the Congestion window based on Network parameters.
What is DQN? What is an Agent?
[Diagram: the RL loop: the Agent takes Actions on the Environment and receives Observations (Obs) and Rewards.]
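DQN approximates the action-value function Q(s, a) with a neural network, picks actions epsilon-greedily, and learns from transitions sampled out of a replay buffer. The following is a generic minimal sketch in PyTorch, not necessarily the framework, layer sizes, or hyperparameters our agent used.

```python
import random
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Small MLP mapping an observation vector to one Q-value per action."""
    def __init__(self, obs_dim, n_actions):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, n_actions),
        )

    def forward(self, x):
        return self.net(x)

def select_action(q_net, obs, epsilon, n_actions):
    # Epsilon-greedy: explore with probability epsilon, otherwise exploit.
    if random.random() < epsilon:
        return random.randrange(n_actions)
    with torch.no_grad():
        return int(q_net(torch.as_tensor(obs, dtype=torch.float32)).argmax())

def dqn_update(q_net, target_net, optimizer, batch, gamma=0.99):
    # batch holds tensors sampled from a replay buffer.
    obs, actions, rewards, next_obs, done = batch
    q = q_net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = rewards + gamma * target_net(next_obs).max(1).values * (1 - done)
    loss = nn.functional.mse_loss(q, target)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```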
Environment and Actions
Observable Parameters
Possible Actions
*We used TimeBasedTCPEnvironment from NS3Gym
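For reference, a minimal ns3-gym control loop looks roughly like the sketch below; the port, step time, and seed values are illustrative, and the random policy is only a stand-in for the trained agent.

```python
from ns3gym import ns3env

# Attach to the ns-3 TCP environment exposed by ns3-gym
# (port/stepTime/simSeed are illustrative values).
env = ns3env.Ns3Env(port=5555, stepTime=0.5, startSim=True, simSeed=12)

obs = env.reset()
done = False
while not done:
    # Placeholder policy: a trained DQN agent would choose the action here.
    action = env.action_space.sample()
    obs, reward, done, info = env.step(action)

env.close()
```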
Our Approach to RL Agent
Segments Acked in RTT (Sim Time)
Rewarding the neural network based on the number of segments acked in one step and the simulation time taken by that step.
Maximising throughput
Rewarding the neural network solely based on the change in observed throughput.
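As a rough sketch of the two reward shapes above (function and variable names are ours, not from the ns3-gym API, and the exact way the first reward combines segments acked with step time is only one possible choice):

```python
def reward_segments_acked(segments_acked, step_sim_time):
    # Reward segments acked per unit of simulation time in this step,
    # so acking more data in less simulated time earns more reward.
    return segments_acked / max(step_sim_time, 1e-9)

def reward_throughput_delta(throughput, prev_throughput):
    # Reward only the change in observed throughput: positive when the
    # chosen cWnd action improved throughput, negative when it degraded it.
    return throughput - prev_throughput
```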
Segments Acked in RTT (Sim Time)
For TCP RL Agent: Pkts Rx: 5900
For TCP NewReno on RL Env: Pkts Rx: 5367
*Axis scales differ because different environments were used for simulation
Maximising throughput
*Test graphs are from a saved snapshot of the online model
Pkts Rx: 5900
Interesting Results:
Self-Regulating Behaviour
During testing with various parameters, we found that the network sometimes becomes adept at self-regulation, transmitting more than 6000 packets in one episode.
This ability, however, came with instability in the static model being trained and led to catastrophic forgetting or overfitting.
Interesting Results:
Online Learning and the Ability to Transfer Learn
We tried training a pretrained model and saw that it was capable of achieving even better self-regulating results; the sample shown achieves maximum throughput almost instantly.
In the context of CC, a model that is constantly trained on link parameters would perform better than a static saved model; our training script can handle this out of the box.
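A minimal sketch of what continuing to train a saved model looks like, assuming the PyTorch QNetwork from the earlier sketch; the checkpoint file name and dimensions here are hypothetical.

```python
import torch

# Resume from a saved snapshot instead of starting from random weights
# ("tcp_rl_agent.pt", obs_dim and n_actions are hypothetical placeholders).
q_net = QNetwork(obs_dim=5, n_actions=3)
q_net.load_state_dict(torch.load("tcp_rl_agent.pt"))
q_net.train()  # keep updating the weights while the link is live (online learning)

optimizer = torch.optim.Adam(q_net.parameters(), lr=1e-4)
# ...then continue the usual DQN loop (select_action / dqn_update) against the live env.
```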
Combining our learnings to propose a hypothetical model that might be used in the future for CC based on deep learning.
Achievements of our model: self-regulating behaviour, online learning is possible, it considers more parameters, and it can maximise throughput.
Drawbacks of our model: instability in training due to inherent properties of NNs, the action space must be set based on the maximum cWnd size, and a static snapshot of the model might be overfit.
Possible solutions: an RL-based CC model combined with a deterministic algorithm such as Tahoe, Cubic, or Reno, which would bring stability to the actions taken by the agent.
A critic model could choose between the action given by the RL agent and the action given by the deterministic algorithm, all combined with online learning (adding GPUs/TPUs to routers?).
Hypothetical Architecture
Boot up: use the deterministic algorithm to find the action space for the agent, based on slow start.
Working: get actions from both the RL agent and the deterministic algorithm, and decide between them using a critic network.
[Diagram: Start up phase: the deterministic algorithm runs until the first 3 duplicate ACKs or a packet loss, which sets the action space for the RL-based agent; the work phase then runs for X time before rebooting with slow start from the then-present ssThresh.]
[Diagram: Work phase: observations feed both the RL-based agent and the deterministic algorithm; a critic network chooses among the proposed actions (+, -, 0) and rewards flow back to the agent.]
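A sketch of how the hybrid decision step might look; all names here are hypothetical, deterministic_action is only a stand-in for a NewReno-like rule, and the critic is assumed to be just another small network that scores a (state, candidate action) pair.

```python
import torch

def deterministic_action(obs):
    # Stand-in for a NewReno-like rule over the discrete actions (+1, 0, -1):
    # back off on loss, grow below ssThresh, otherwise hold.
    cwnd, ssthresh, loss_detected = obs["cWnd"], obs["ssThresh"], obs["loss"]
    if loss_detected:
        return -1
    return +1 if cwnd < ssthresh else 0

def hybrid_step(obs_vec, obs, q_net, critic_net):
    # Candidate 1: action proposed by the RL agent (greedy over Q-values,
    # mapping argmax indices 0/1/2 onto actions -1/0/+1).
    rl_action = int(q_net(torch.as_tensor(obs_vec, dtype=torch.float32)).argmax()) - 1
    # Candidate 2: action proposed by the deterministic algorithm.
    det_action = deterministic_action(obs)
    # Critic network scores each (state, candidate action) pair; pick the better one.
    scores = [float(critic_net(torch.as_tensor(obs_vec + [a], dtype=torch.float32)))
              for a in (rl_action, det_action)]
    return rl_action if scores[0] >= scores[1] else det_action
```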
Comparison between non-RL and RL algorithms
We used the same topology for the RL-based simulations as for the non-RL simulations
Variation in Congestion Window Sizes:
Number of Packets Sent | 10000
Size of a Packet       | 1 KB
Max Simulation Time    | 100 seconds
Our Team
Prayag Savsani - AU1841035
Yashraj Kakkad - AU1841036
Kaushal Patil - AU1841040
Thanks