1 of 22

TCP Congestion Control and Reinforcement Learning

CSE331 - Computer Networks Lab

Guided by: Prof. Jitendra Bhatia

Prayag Savsani - AU1841035

Yashraj Kakkad - AU1841036

Kaushal Patil - AU1841040

2 of 22

  • 01 Introduction: description of the project and the work done
  • 02 ns-3, Testing CC algorithms: details about ns-3 testing and the topologies used
  • 03 RL for CC: details about our implementation of an RL agent for CC
  • 04 Results, Graphs and Conclusion: produced results, graphs, and their conclusions


3 of 22

Introduction to TCP Congestion Control

  • A network is said to be congested when too many packets contend for the same link and some of them have to be dropped.
  • Congestion control prevents senders from overwhelming the network.
  • TCP supports various mechanisms for controlling congestion.


4 of 22

Project Objectives

  • Topologies: applying CC to different topologies
  • Testing: testing different CC algorithms
  • RL: creating an RL agent capable of regulating the congestion window
  • Comparing: comparing the RL agent with TCP NewReno


5 of 22

ns-3 Simulation of TCP Congestion Control

Simulating a topology and analysing variations in congestion window and throughput.


6 of 22

What is ns-3?


  • A discrete-event network simulator
  • Targeted primarily for research and educational use
  • Free software, licensed under the GNU GPLv2 license

7 of 22

Our simulated topology in ns-3


Key features of the simulation:

  • A rate error model is introduced to induce retransmissions.
  • The TCP socket is interfaced, and callback functions are used to measure the congestion window.
  • Throughput is measured using FlowMonitor (a post-processing sketch follows below).
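As an illustration of the last point, here is a minimal post-processing sketch that computes per-flow and aggregate throughput from a FlowMonitor XML dump. It is not the project's actual script: the file name flow-stats.xml and the helper names are our assumptions.

```python
# Hypothetical post-processing sketch: compute per-flow and aggregate throughput
# from a FlowMonitor XML dump.  The file name "flow-stats.xml" is our assumption;
# such a file is produced in ns-3 with FlowMonitor's SerializeToXmlFile().
import xml.etree.ElementTree as ET


def parse_ns3_time(value):
    """FlowMonitor stores times as strings such as '+1234567890.0ns'."""
    return float(value.strip("+ns")) * 1e-9  # convert nanoseconds to seconds


def flow_throughputs(xml_path):
    """Return {flowId: throughput in Mbit/s} for every flow in the dump."""
    root = ET.parse(xml_path).getroot()
    throughputs = {}
    for flow in root.find("FlowStats").findall("Flow"):
        rx_bytes = int(flow.get("rxBytes"))
        start = parse_ns3_time(flow.get("timeFirstTxPacket"))
        stop = parse_ns3_time(flow.get("timeLastRxPacket"))
        if stop > start:
            throughputs[flow.get("flowId")] = rx_bytes * 8 / (stop - start) / 1e6
    return throughputs


if __name__ == "__main__":
    per_flow = flow_throughputs("flow-stats.xml")
    for flow_id, mbps in per_flow.items():
        print(f"Flow {flow_id}: {mbps:.3f} Mbit/s")
    print(f"Aggregate throughput: {sum(per_flow.values()):.3f} Mbit/s")
```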

8 of 22

TCP NewReno Simulation


[Graphs: variation in congestion window sizes; aggregate throughput.]

Simulation parameters:

  • Number of packets sent: 10000
  • Size of a packet: 1 KB
  • Max simulation time: 100 seconds

9 of 22

Reinforcement Learning and Congestion Control

Getting a DQN Agent to control the Congestion window based on Network parameters.


10 of 22

What is DQN? What is an Agent?


[Diagram: the Agent chooses Actions and sends them to the Environment; the Environment returns Observations and Rewards to the Agent.]
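As background (not spelled out on the slide): a DQN agent approximates the action-value function Q(s, a) with a neural network and is trained to minimise the loss below, where \(\theta^{-}\) denotes the weights of a periodically updated target network.

$$\mathcal{L}(\theta) = \mathbb{E}_{(s,a,r,s')}\left[\Big(r + \gamma \max_{a'} Q(s', a'; \theta^{-}) - Q(s, a; \theta)\Big)^{2}\right]$$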

11 of 22

Environment and Actions

Observable Parameters

  • Unique Socket ID
  • TCP Env Type
  • Sim Time In us
  • Unique Node ID
  • Current Threshold
  • Current Congestion Window Size
  • Segment Size
  • Bytes In Flight Sum
  • Bytes In Flight Avg
  • Segments Acked Sum
  • Segments Acked Avg
  • Avg Rtt
  • Min Rtt
  • Avg Inter Tx
  • Avg Inter Rx
  • Throughput

Possible Actions

  • Increase Cwnd by 5000
  • Decrease Cwnd by 1000
  • Keep Cwnd Same

*We used TimeBasedTCPEnvironment from NS3Gym


Environment

  • The simulation environment
  • Provides observations (the state of the system)
  • Takes actions
  • Returns new observations after performing the given action
  • Responsible for simulating the network environment (an interaction-loop sketch follows below)
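A minimal sketch, assuming the ns3-gym Python bindings, of how an agent could step such an environment with the action set listed above. The port, the observation indices, the action format, and the helper name apply_action are our assumptions, not taken from the slides.

```python
# Minimal sketch of an agent stepping an ns3-gym TCP environment.
# Assumptions (not from the slides): the port, the observation indices,
# the [new_ssThresh, new_cWnd] action format, and the helper apply_action.
from ns3gym import ns3env

# The three actions listed above: grow cWnd by 5000, shrink it by 1000, or keep it.
CWND_DELTAS = [5000, -1000, 0]


def apply_action(obs, action_index):
    """Map a discrete action index to an action for the environment."""
    ssthresh = obs[4]                      # current threshold (index assumed from the list above)
    cwnd = obs[5]                          # current congestion window size
    new_cwnd = max(0, cwnd + CWND_DELTAS[action_index])
    return [ssthresh, new_cwnd]            # leave ssThresh unchanged in this sketch


env = ns3env.Ns3Env(port=5555, startSim=True)   # connects to the ns-3 simulation
obs = env.reset()
done = False
while not done:
    action_index = 2                            # placeholder policy: keep cWnd the same
    obs, reward, done, info = env.step(apply_action(obs, action_index))
env.close()
```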

12 of 22

Our Approach to RL Agent

Segments Acked in RTT (Sim Time)

Rewarding the neural network based on the number of segments acked in one step and the sim time taken for that step.

Maximising throughput

Rewarding the neural network solely based on the observed change in throughput.
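A hedged sketch of what these two reward formulations could look like in the agent's code; the normalisation and any constants are our assumptions, as the slides do not give the exact formulas.

```python
# Hedged sketches of the two reward formulations; the normalisation is our assumption.

def reward_segments_acked(segments_acked, step_sim_time_us):
    """Reward: segments acked in one step, normalised by that step's sim time."""
    if step_sim_time_us <= 0:
        return 0.0
    return segments_acked / (step_sim_time_us / 1e6)   # acked segments per second of sim time


def reward_throughput_delta(throughput, previous_throughput):
    """Reward: based solely on the observed change in throughput."""
    return throughput - previous_throughput
```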


13 of 22

Segments Acked in RTT (Sim Time)

[Graphs: results for the TCP RL agent (Pts Rx: 5900) and for TCP NewReno on the RL environment (Pts Rx: 5367). Axis scales differ because different environments were used for the simulations.]


14 of 22

Maximising throughput

[Graph: test run of a saved image of the online model; Pts Rx: 5900.]


15 of 22

Interesting Results:

Self-Regulating Behaviour

While testing various parameters, we found that the network sometimes becomes adept at self-regulation, transmitting more than 6000 packets in a single episode.

This ability, however, came with instability in the static model being trained and led to catastrophic forgetting or overfitting.


16 of 22

Interesting Results:

Online Learning and Transfer Learning

We tried further training a pretrained model and saw that it was capable of achieving even better self-regulating results; the sample shown achieves maximum throughput almost instantly.

In the context of CC, a model that is constantly trained on the link's parameters would perform better than a static saved model; our training script can handle this out of the box.
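As a rough illustration of this idea, the sketch below loads a saved model and keeps taking small DQN-style gradient steps on freshly collected transitions. It is not the project's actual training script; the model file name, replay-buffer size, and hyperparameters are assumptions.

```python
# Hypothetical sketch of online / transfer learning for the DQN agent.
# The model file name, the replay-buffer size, and the hyperparameters are assumptions.
import random
from collections import deque

import numpy as np
from tensorflow import keras

model = keras.models.load_model("dqn_cc_agent.h5")   # the saved "static image"
replay_buffer = deque(maxlen=10_000)
GAMMA = 0.95
BATCH_SIZE = 32


def online_update(transition):
    """Store one (state, action, reward, next_state, done) tuple and take a small training step."""
    replay_buffer.append(transition)
    if len(replay_buffer) < BATCH_SIZE:
        return
    batch = random.sample(replay_buffer, BATCH_SIZE)
    states = np.array([t[0] for t in batch])
    next_states = np.array([t[3] for t in batch])
    targets = model.predict(states, verbose=0)
    next_q = model.predict(next_states, verbose=0)
    for i, (_, action, reward, _, done) in enumerate(batch):
        targets[i][action] = reward if done else reward + GAMMA * np.max(next_q[i])
    model.fit(states, targets, epochs=1, verbose=0)    # keep learning while deployed
```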


17 of 22

Combining our learnings, we propose a hypothetical model that might be used in the future for CC based on deep learning.

Achievements of our model: self-regulating behaviour, online learning is possible, more parameters are taken into account, and throughput can be maximised.

Drawbacks of our model: instability in training due to inherent properties of NNs, the action space has to be set based on the maximum cWnd size, and a static image of the model might be overfit.

Possible solutions: an RL-based CC model combined with a deterministic algorithm such as Tahoe, Cubic, or Reno, which would bring stability to the actions taken by the agent.

A critic model could then choose between the action given by the RL agent and the action given by the deterministic algorithm, all with online learning (adding GPUs/TPUs to routers?).
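A toy sketch of the arbitration idea above. Nothing here is an existing implementation: every name is a placeholder, and the deterministic rule is a simplified NewReno-style stand-in.

```python
# Toy sketch of the proposed arbitration; every name here is a placeholder and
# the deterministic rule is a simplified NewReno-style stand-in.

def deterministic_cwnd(obs):
    """Simplified additive-increase rule: grow cWnd by one segment."""
    return obs["cWnd"] + obs["segmentSize"]


def rl_cwnd(obs, rl_agent):
    """cWnd proposed by the RL agent (increase / decrease / keep, as on the action slide)."""
    return rl_agent.propose(obs)


def choose_cwnd(obs, rl_agent, critic):
    """A critic network scores each candidate cWnd for the current state and picks the best."""
    candidates = [rl_cwnd(obs, rl_agent), deterministic_cwnd(obs)]
    scores = [critic.value(obs, cwnd) for cwnd in candidates]
    return candidates[scores.index(max(scores))]
```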


[Diagram: RL-based agent combined with a deterministic algorithm.]

18 of 22

Hypothetical Architecture

Boot-up: use the deterministic algorithm, based on slow start, to find the action space for the agent.

Working: get actions from both the RL agent and the deterministic algorithm, and decide between them using a critic network.


[Diagram: In the start-up phase, the deterministic algorithm runs until the first 3 duplicate ACKs (3DA) or a packet loss, which fixes the action space for the RL-based agent. In the work phase, observations are fed to both the RL-based agent and the deterministic algorithm; a critic network chooses among their actions (+, -, 0) and receives rewards. The work phase runs for X time, after which the system reboots with slow start from the then-present ssThresh.]

19 of 22

Comparison between non-RL and RL algorithms

We used the same topology for the RL-based simulations that was used in the non-RL simulations.


20 of 22


[Graph: variation in congestion window sizes.]

Simulation parameters:

  • Number of packets sent: 10000
  • Size of a packet: 1 KB
  • Max simulation time: 100 seconds

21 of 22

Our Team

Kaushal Patil - AU1841040

Yashraj Kakkad - AU1841036

Prayag Savsani - AU1841035


22 of 22

Thanks
