
The Network Layer

UNIT-4

RCEW, Pasupula (V), Nandikotkur Road,

Near Venkayapalli, KURNOOL

By

G. Fayaz Hussain

Assistant Professor

Department of CSE

Ravindra College of Engineering for Women

Kurnool – 518452, Andhra Pradesh, India


Unit 4: Network layer

chapter goals:

  • understand principles behind network layer services:
    • network layer service models
    • forwarding versus routing
    • how a router works
    • routing (path selection)
    • broadcast, multicast
  • instantiation, implementation in the Internet

Unit 4: outline

4.1 introduction

4.2 virtual circuit and datagram networks

4.3 IP: Internet Protocol

    • datagram format
    • IPv4 addressing
    • ICMP
    • IPv6

4.4 routing algorithms

    • link state
    • distance vector
    • hierarchical routing

4.5 routing in the Internet

    • RIP
    • OSPF
    • BGP

4.6 broadcast and multicast routing

Network layer

  • transport segment from sending to receiving host
  • on sending side encapsulates segments into datagrams
  • on receiving side, delivers segments to transport layer
  • network layer protocols in every host, router
  • router examines header fields in all IP datagrams passing through it

[Figure: protocol stacks along the path. The two end hosts show all five layers (application, transport, network, data link, physical); every router shows only network, data link, physical.]

Two key network-layer functions

  • forwarding: move packets from router’s input to appropriate router output
  • routing: determine route taken by packets from source to dest.
    • routing algorithms


analogy:

  • routing: process of planning trip from source to dest
  • forwarding: process of getting through single interchange

Interplay between routing and forwarding

routing algorithm determines end-end-path through network

forwarding table determines local forwarding at this router

local forwarding table (filled in by routing algorithm):

header value   output link
0100           3
0101           2
0111           2
1001           1

example: value in arriving packet’s header is 0111, so the packet leaves on output link 2

Connection setup

  • 3rd important function in some network architectures:
    • ATM, frame relay, X.25
  • before datagrams flow, two end hosts and intervening routers establish virtual connection
    • routers get involved
  • network vs transport layer connection service:
    • network: between two hosts (may also involve intervening routers in case of VCs)
    • transport: between two processes


Network service model

example services for individual datagrams:

  • guaranteed delivery
  • guaranteed delivery with less than 40 msec delay

example services for a flow of datagrams:

  • in-order datagram delivery
  • guaranteed minimum bandwidth to flow
  • restrictions on changes in inter-packet spacing


Q: What service model for “channel” transporting datagrams from sender to receiver?


Network layer service models:

Guarantees ? (columns Bandwidth, Loss, Order, Timing)

Network Architecture   Service Model   Bandwidth            Loss   Order   Timing   Congestion feedback
Internet               best effort     none                 no     no      no       no (inferred via loss)
ATM                    CBR             constant rate        yes    yes     yes      no congestion
ATM                    VBR             guaranteed rate      yes    yes     yes      no congestion
ATM                    ABR             guaranteed minimum   no     yes     no       yes
ATM                    UBR             none                 no     yes     no       no


Connection, connection-less service

  • datagram network provides network-layer connectionless service
  • virtual-circuit network provides network-layer connection service
  • analogous to TCP/UDP connection-oriented / connectionless transport-layer services, but:
    • service: host-to-host
    • no choice: network provides one or the other
    • implementation: in network core


Virtual circuits

  • call setup, teardown for each call before data can flow
  • each packet carries VC identifier (not destination host address)
  • every router on source-dest path maintains “state” for each passing connection
  • link, router resources (bandwidth, buffers) may be allocated to VC (dedicated resources = predictable service)

“source-to-dest path behaves much like telephone circuit”

    • performance-wise
    • network actions along source-to-dest path


VC implementation

a VC consists of:

    • path from source to destination
    • VC numbers, one number for each link along path
    • entries in forwarding tables in routers along path
  1. packet belonging to VC carries VC number (rather than dest address)
  2. VC number can be changed on each link.
    • new VC number comes from forwarding table


VC forwarding table

[Figure: router with interfaces numbered 1, 2, 3; VC numbers 12, 22, 32 label successive links along one VC’s path]

forwarding table in northwest router:

Incoming interface   Incoming VC #   Outgoing interface   Outgoing VC #
1                    12              3                    22
2                    63              1                    18
3                    7               2                    17
1                    97              3                    87
…                    …               …                    …

VC routers maintain connection state information!
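A minimal sketch of how a VC router could use the table above, assuming the table is held as a dictionary keyed by (incoming interface, incoming VC #). The names `VC_TABLE` and `forward_vc` are hypothetical and chosen only for illustration.

```python
# Forwarding table of the northwest router, copied from the slide.
# key: (incoming interface, incoming VC #)
# value: (outgoing interface, outgoing VC #)
VC_TABLE = {
    (1, 12): (3, 22),
    (2, 63): (1, 18),
    (3, 7): (2, 17),
    (1, 97): (3, 87),
}

def forward_vc(in_interface, in_vc):
    """Return (outgoing interface, new VC number) for an arriving packet.

    A packet carries only its VC number, not a destination address,
    so the lookup uses the interface it arrived on plus that number.
    Raises KeyError when no VC has been set up for the pair.
    """
    return VC_TABLE[(in_interface, in_vc)]

out_interface, out_vc = forward_vc(1, 12)
```

The per-connection entries are the “state” the slide refers to: call setup would add an entry and teardown would delete it.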

Virtual circuits: signaling protocols

  • used to set up, maintain, tear down VC
  • used in ATM, frame-relay, X.25
  • not used in today’s Internet

signaling steps between the two end hosts:

  1. initiate call
  2. incoming call
  3. accept call
  4. call connected
  5. data flow begins
  6. receive data

Datagram networks

  • no call setup at network layer
  • routers: no state about end-to-end connections
    • no network-level concept of “connection”
  • packets forwarded using destination host address

steps between the two end hosts:

  1. send datagrams
  2. receive datagrams

Datagram forwarding table

4 billion IP addresses, so rather than list individual destination addresses, list ranges of addresses (aggregate table entries)

local forwarding table (filled in by routing algorithm), looked up with the IP destination address in arriving packet’s header:

dest address      output link
address-range 1   3
address-range 2   2
address-range 3   2
address-range 4   1

Datagram forwarding table

Destination Address Range                          Link Interface

11001000 00010111 00010000 00000000
  through                                          0
11001000 00010111 00010111 11111111

11001000 00010111 00011000 00000000
  through                                          1
11001000 00010111 00011000 11111111

11001000 00010111 00011001 00000000
  through                                          2
11001000 00010111 00011111 11111111

otherwise                                          3

Q: but what happens if ranges don’t divide up so nicely?

Longest prefix matching

longest prefix matching: when looking for forwarding table entry for given destination address, use longest address prefix that matches destination address.

Destination Address Range                Link interface
11001000 00010111 00010*** ********      0
11001000 00010111 00011000 ********      1
11001000 00010111 00011*** ********      2
otherwise                                3

examples:

DA: 11001000 00010111 00010110 10100001   which interface?
DA: 11001000 00010111 00011000 10101010   which interface?
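The longest-prefix rule can be sketched as follows, assuming addresses and prefixes are kept as bit strings exactly as printed on the slide. `PREFIX_TABLE` and `lookup` are illustrative names, and a linear scan stands in for the specialised lookup structures a real router would use.

```python
# Prefixes from the slide (wildcard bits dropped), with their link interface.
PREFIX_TABLE = [
    ("11001000 00010111 00010", 0),
    ("11001000 00010111 00011000", 1),
    ("11001000 00010111 00011", 2),
]
DEFAULT_INTERFACE = 3  # the "otherwise" row

def lookup(dest_address):
    """Return the interface of the longest prefix matching dest_address."""
    bits = dest_address.replace(" ", "")
    best_len, best_interface = -1, DEFAULT_INTERFACE
    for prefix, interface in PREFIX_TABLE:
        p = prefix.replace(" ", "")
        if bits.startswith(p) and len(p) > best_len:
            best_len, best_interface = len(p), interface
    return best_interface

first = lookup("11001000 00010111 00010110 10100001")   # matches only the first row
second = lookup("11001000 00010111 00011000 10101010")  # matches rows two and three
```

For the two example addresses this gives interface 0 and interface 1: the second address matches both the 21-bit and the 24-bit prefix, and the longer one wins.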

Datagram or VC network: why?

Internet (datagram)

  • data exchange among computers
    • “elastic” service, no strict timing req.
  • many link types
    • different characteristics
    • uniform service difficult
  • “smart” end systems (computers)
    • can adapt, perform control, error recovery
    • simple inside network, complexity at “edge”

ATM (VC)

  • evolved from telephony
  • human conversation:
    • strict timing, reliability requirements
    • need for guaranteed service
  • “dumb” end systems
    • telephones
    • complexity inside network


The Internet network layer

host, router network layer functions:

transport layer: TCP, UDP

network layer:

  • routing protocols
    • path selection
    • RIP, OSPF, BGP
  • forwarding table
  • IP protocol
    • addressing conventions
    • datagram format
    • packet handling conventions
  • ICMP protocol
    • error reporting
    • router “signaling”

link layer

physical layer

IP datagram format

header layout (each row is 32 bits):

ver | head. len | type of service | length
16-bit identifier | flgs | fragment offset
time to live | upper layer | header checksum
32 bit source IP address
32 bit destination IP address
options (if any)
data (variable length, typically a TCP or UDP segment)

field notes:

  • ver: IP protocol version number
  • head. len: header length (stored as a count of 32-bit words; 20 bytes when there are no options)
  • type of service: “type” of data
  • length: total datagram length (bytes)
  • 16-bit identifier, flgs, fragment offset: for fragmentation/reassembly
  • time to live: max number remaining hops (decremented at each router)
  • upper layer: upper layer protocol to deliver payload to
  • options: e.g. timestamp, record route taken, specify list of routers to visit.

how much overhead?

  • 20 bytes of TCP
  • 20 bytes of IP
  • = 40 bytes + app layer overhead
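To make the 20-byte figure concrete, here is a small sketch that packs and unpacks the fixed part of an IPv4 header with Python’s standard `struct` module. The function name `parse_ipv4_header` and the sample field values are assumptions for illustration; options are ignored and the checksum is left at 0 rather than computed.

```python
import struct

# Fixed IPv4 header: 1+1+2 + 2+2 + 1+1+2 + 4 + 4 = 20 bytes, network byte order.
IPV4_FIXED = "!BBHHHBBH4s4s"

def parse_ipv4_header(raw):
    """Unpack the fixed 20-byte part of an IPv4 header into a dict."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, upper, checksum, src, dst) = struct.unpack(IPV4_FIXED, raw[:20])
    return {
        "version": ver_ihl >> 4,                    # high 4 bits
        "header_len_bytes": (ver_ihl & 0x0F) * 4,   # field counts 32-bit words
        "type_of_service": tos,
        "total_length": total_len,
        "identifier": ident,
        "flags": flags_frag >> 13,                  # top 3 bits
        "fragment_offset": flags_frag & 0x1FFF,     # low 13 bits
        "time_to_live": ttl,
        "upper_layer": upper,                       # 6 = TCP, 17 = UDP
        "header_checksum": checksum,
        "source": ".".join(str(b) for b in src),
        "destination": ".".join(str(b) for b in dst),
    }

# Hypothetical datagram: 20-byte IP header + 20-byte TCP header, no data.
sample = struct.pack(IPV4_FIXED, 0x45, 0, 40, 1, 0, 64, 6, 0,
                     bytes([223, 1, 1, 1]), bytes([223, 1, 2, 2]))
header = parse_ipv4_header(sample)
```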


IP fragmentation, reassembly

  • network links have MTU (maximum transmission unit): largest amount of data a link-level frame can carry
    • different link types, different MTUs
  • large IP datagram divided (“fragmented”) within net
    • one datagram becomes several datagrams
    • “reassembled” only at final destination
    • IP header bits used to identify, order related fragments

[Figure: fragmentation: in: one large datagram, out: 3 smaller datagrams; reassembly at the final destination]
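As a rough sketch of the arithmetic behind fragmentation, the function below splits one datagram into fragments that fit a link MTU. The sizes used (a 4000-byte datagram, a 1500-byte MTU, a 20-byte header without options) are assumed example values, not taken from the slide; `fragment` is a hypothetical helper name.

```python
def fragment(total_length, mtu, header_len=20):
    """Split a datagram of total_length bytes into fragments fitting mtu.

    Each fragment repeats the IP header. The offset field counts
    8-byte units, so every fragment except the last must carry a
    multiple of 8 data bytes.
    """
    payload = total_length - header_len
    max_data = (mtu - header_len) // 8 * 8
    fragments, sent = [], 0
    while sent < payload:
        size = min(max_data, payload - sent)
        fragments.append({
            "length": size + header_len,      # total length of this fragment
            "offset": sent // 8,              # fragment offset field
            "more_fragments": sent + size < payload,
        })
        sent += size
    return fragments

pieces = fragment(4000, 1500)
```

With these assumed sizes the result is three fragments of 1500, 1500, and 1040 bytes at offsets 0, 185, and 370; only the last has the more-fragments flag cleared, which is how the destination knows reassembly is complete.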


IP addressing: introduction

  • IP address: 32-bit identifier for host, router interface
  • interface: connection between host/router and physical link
    • routers typically have multiple interfaces
    • host typically has one or two interfaces (e.g., wired Ethernet, wireless 802.11)
  • IP addresses associated with each interface

[Figure: one router joining three groups of interfaces: 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.9; 223.1.3.1, 223.1.3.2, 223.1.3.27]

223.1.1.1 = 11011111 00000001 00000001 00000001
            (223)    (1)      (1)      (1)
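The dotted-decimal to binary conversion shown for 223.1.1.1 can be reproduced with a few lines. This is only an illustrative sketch; `to_binary` and `to_int` are hypothetical helper names.

```python
def to_binary(dotted):
    """Render a dotted-decimal IPv4 address as four 8-bit groups."""
    return " ".join(f"{int(octet):08b}" for octet in dotted.split("."))

def to_int(dotted):
    """Return the address as the single 32-bit number it really is."""
    value = 0
    for octet in dotted.split("."):
        value = (value << 8) | int(octet)
    return value

bits = to_binary("223.1.1.1")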

IP addressing: introduction

Q: how are interfaces actually connected?

A: we’ll learn about that in chapter 5, 6.


A: wired Ethernet interfaces connected by Ethernet switches

A: wireless WiFi interfaces connected by WiFi base station

For now: don’t need to worry about how one interface is connected to another (with no intervening router)


Subnets

  • IP address:
    • subnet part - high order bits
    • host part - low order bits
  • what’s a subnet ?
    • device interfaces with same subnet part of IP address
    • can physically reach each other without intervening router

[Figure: network consisting of 3 subnets; interfaces 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4 form one subnet, 223.1.2.1, 223.1.2.2, 223.1.2.9 a second, and 223.1.3.1, 223.1.3.2, 223.1.3.27 a third]

Subnets

recipe

  • to determine the subnets, detach each interface from its host or router, creating islands of isolated networks
  • each isolated network is called a subnet

subnet mask: /24

subnets in the figure: 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24

Subnets

how many?

[Figure: interconnected routers and hosts with interface addresses 223.1.1.1, 223.1.1.2, 223.1.1.3, 223.1.1.4; 223.1.2.1, 223.1.2.2, 223.1.2.6; 223.1.3.1, 223.1.3.2, 223.1.3.27; 223.1.7.0, 223.1.7.1; 223.1.8.0, 223.1.8.1; 223.1.9.1, 223.1.9.2]
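One way to approach the “how many?” question is to group the interface addresses by their subnet part. The sketch below assumes every subnet uses a /24 mask, as on the previous slide; that mask is an assumption here, not something this slide states. It uses the standard `ipaddress` module, and `count_subnets` is a hypothetical name.

```python
import ipaddress

# Interface addresses read off the figure.
INTERFACES = [
    "223.1.1.1", "223.1.1.2", "223.1.1.3", "223.1.1.4",
    "223.1.2.1", "223.1.2.2", "223.1.2.6",
    "223.1.3.1", "223.1.3.2", "223.1.3.27",
    "223.1.7.0", "223.1.7.1",
    "223.1.8.0", "223.1.8.1",
    "223.1.9.1", "223.1.9.2",
]

def count_subnets(addresses, prefix_len=24):
    """Group interfaces by subnet part and return the distinct subnets."""
    subnets = {
        ipaddress.ip_interface(f"{addr}/{prefix_len}").network
        for addr in addresses
    }
    return sorted(subnets)

found = count_subnets(INTERFACES)
```

Under the /24 assumption this yields six subnets: three LANs plus the three router-to-router links numbered 223.1.7, 223.1.8, and 223.1.9.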

IP addressing: CIDR

CIDR: Classless InterDomain Routing

    • subnet portion of address of arbitrary length
    • address format: a.b.c.d/x, where x is # bits in subnet portion of address

200.23.16.0/23

11001000 00010111 0001000 | 0 00000000
      subnet part (23 bits) | host part (9 bits)
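The a.b.c.d/x notation can be explored with Python’s standard `ipaddress` module. The sketch below uses the slide’s 200.23.16.0/23 block; the two probe addresses are arbitrary examples chosen for illustration.

```python
import ipaddress

net = ipaddress.ip_network("200.23.16.0/23")

prefix_bits = net.prefixlen        # x in a.b.c.d/x: bits in the subnet part
host_bits = 32 - net.prefixlen     # remaining bits form the host part
mask = str(net.netmask)            # same information in dotted-decimal form
size = net.num_addresses           # 2 ** host_bits addresses in the block

# An address belongs to the subnet when its first 23 bits match.
inside = ipaddress.ip_address("200.23.17.255") in net
outside = ipaddress.ip_address("200.23.18.0") in net
```

Because the subnet part is 23 bits rather than 24, the block spans third-octet values 16 and 17, which is what “subnet portion of arbitrary length” allows.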

IP addresses: how to get one?

Q: How does a host get IP address?

  • hard-coded by system admin in a file
    • Windows: control-panel->network->configuration->tcp/ip->properties
    • UNIX: /etc/rc.config
  • DHCP: Dynamic Host Configuration Protocol: dynamically get address from a server
    • “plug-and-play”


DHCP: Dynamic Host Configuration Protocol

goal: allow host to dynamically obtain its IP address from network server when it joins network

    • can renew its lease on address in use
    • allows reuse of addresses (only hold address while connected/“on”)
    • support for mobile users who want to join network (more shortly)

DHCP overview:

    • host broadcasts “DHCP discover” msg [optional]
    • DHCP server responds with “DHCP offer” msg [optional]
    • host requests IP address: “DHCP request” msg
    • DHCP server sends address: “DHCP ack” msg


DHCP client-server scenario

[Figure: network with subnets 223.1.1.0/24, 223.1.2.0/24, 223.1.3.0/24; DHCP server at 223.1.2.5; arriving DHCP client needs address in this network]

1. DHCP discover (client): Broadcast: is there a DHCP server out there?
     src: 0.0.0.0, 68
     dest: 255.255.255.255, 67
     yiaddr: 0.0.0.0
     transaction ID: 654

2. DHCP offer (server): Broadcast: I’m a DHCP server! Here’s an IP address you can use
     src: 223.1.2.5, 67
     dest: 255.255.255.255, 68
     yiaddr: 223.1.2.4
     transaction ID: 654
     lifetime: 3600 secs

3. DHCP request (client): Broadcast: OK. I’ll take that IP address!
     src: 0.0.0.0, 68
     dest: 255.255.255.255, 67
     yiaddr: 223.1.2.4
     transaction ID: 655
     lifetime: 3600 secs

4. DHCP ACK (server): Broadcast: OK. You’ve got that IP address!
     src: 223.1.2.5, 67
     dest: 255.255.255.255, 68
     yiaddr: 223.1.2.4
     transaction ID: 655
     lifetime: 3600 secs

DHCP: more than IP addresses

DHCP can return more than just allocated IP address on subnet:

    • address of first-hop router for client
    • name and IP address of DNS server
    • network mask (indicating network versus host portion of address)


DHCP: example

  • connecting laptop needs its IP address, addr of first-hop router, addr of DNS server: use DHCP

  • DHCP request encapsulated in UDP, encapsulated in IP, encapsulated in 802.3 Ethernet
  • Ethernet frame broadcast (dest: FFFFFFFFFFFF) on LAN, received at router running DHCP server
  • Ethernet frame demuxed to IP, IP datagram demuxed to UDP, UDP segment demuxed to DHCP

[Figure: connecting laptop and router 168.1.1.1 (router with DHCP server built into router); the DHCP message travels down the laptop’s DHCP, UDP, IP, Eth, Phy stack and up the same stack at the router]

DHCP: example

  • DHCP server formulates DHCP ACK containing client’s IP address, IP address of first-hop router for client, name & IP address of DNS server
  • encapsulation at DHCP server, frame forwarded to client, demuxing up to DHCP at client
  • client now knows its IP address, name and IP address of DNS server, IP address of its first-hop router

[Figure: the DHCP ACK travels down the DHCP, UDP, IP, Eth, Phy stack at the router (router with DHCP server built into router) and up the same stack at the client]

IP addresses: how to get one?

Q: how does network get subnet part of IP addr?

A: gets allocated portion of its provider ISP’s address space

ISP's block      11001000 00010111 00010000 00000000   200.23.16.0/20

Organization 0   11001000 00010111 00010000 00000000   200.23.16.0/23
Organization 1   11001000 00010111 00010010 00000000   200.23.18.0/23
Organization 2   11001000 00010111 00010100 00000000   200.23.20.0/23
...              ...                                   ...
Organization 7   11001000 00010111 00011110 00000000   200.23.30.0/23
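The split of the ISP’s /20 block into eight /23 blocks can be checked with the standard `ipaddress` module. This is a sketch of the arithmetic only; the variable names are illustrative.

```python
import ipaddress

isp_block = ipaddress.ip_network("200.23.16.0/20")

# Going from /20 to /23 adds 3 subnet bits, so the block splits into
# 2 ** 3 = 8 equal pieces, one per organization.
org_blocks = list(isp_block.subnets(new_prefix=23))

org_names = {f"Organization {i}": str(block)
             for i, block in enumerate(org_blocks)}
```

Each piece holds 512 addresses, and the three extra bits are exactly the bits that differ between the organizations’ binary prefixes in the table above.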

Hierarchical addressing: route aggregation

hierarchical addressing allows efficient advertisement of routing information:

  • Fly-By-Night-ISP connects Organization 0 (200.23.16.0/23), Organization 1 (200.23.18.0/23), Organization 2 (200.23.20.0/23), …, Organization 7 (200.23.30.0/23)
  • Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
  • ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16”

Hierarchical addressing: more specific routes

ISPs-R-Us has a more specific route to Organization 1

  • Fly-By-Night-ISP connects Organization 0 (200.23.16.0/23), Organization 2 (200.23.20.0/23), …, Organization 7 (200.23.30.0/23)
  • Organization 1 (200.23.18.0/23) is now reached through ISPs-R-Us
  • Fly-By-Night-ISP tells the Internet: “Send me anything with addresses beginning 200.23.16.0/20”
  • ISPs-R-Us tells the Internet: “Send me anything with addresses beginning 199.31.0.0/16 or 200.23.18.0/23”
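Why the more specific advertisement wins can be sketched with a longest-prefix choice over the advertised blocks. The list `ADVERTISED` and the function `choose_isp` are hypothetical names; the prefixes are the ones quoted on the slide, and the destination addresses are made-up examples.

```python
import ipaddress

# (advertised prefix, ISP that advertised it)
ADVERTISED = [
    ("200.23.16.0/20", "Fly-By-Night-ISP"),
    ("199.31.0.0/16", "ISPs-R-Us"),
    ("200.23.18.0/23", "ISPs-R-Us"),   # more specific route to Organization 1
]

def choose_isp(destination):
    """Pick the ISP whose advertised prefix matches with the most bits."""
    addr = ipaddress.ip_address(destination)
    matches = [
        (ipaddress.ip_network(prefix), isp)
        for prefix, isp in ADVERTISED
        if addr in ipaddress.ip_network(prefix)
    ]
    if not matches:
        return None
    return max(matches, key=lambda m: m[0].prefixlen)[1]

org1 = choose_isp("200.23.18.5")   # inside both the /20 and the /23
org2 = choose_isp("200.23.20.9")   # inside the /20 only
```

An address in Organization 1 matches both 200.23.16.0/20 and 200.23.18.0/23; the /23 is longer, so traffic goes to ISPs-R-Us while the other organizations are still reached through the aggregate.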

IP addressing: the last word...

Q: how does an ISP get block of addresses?

A: ICANN: Internet Corporation for Assigned Names and Numbers http://www.icann.org/

    • allocates addresses
    • manages DNS
    • assigns domain names, resolves disputes

NAT: network address translation

[Figure: local network (e.g., home network) 10.0.0.0/24 with hosts 10.0.0.1, 10.0.0.2, 10.0.0.3; NAT router with LAN-side address 10.0.0.4 and WAN-side address 138.76.29.7 connects it to the rest of Internet]

  • datagrams with source or destination in this network have 10.0.0.0/24 address for source, destination (as usual)
  • all datagrams leaving local network have same single source NAT IP address: 138.76.29.7, different source port numbers

NAT: network address translation

motivation: local network uses just one IP address as far as outside world is concerned:

    • range of addresses not needed from ISP: just one IP address for all devices
    • can change addresses of devices in local network without notifying outside world
    • can change ISP without changing addresses of devices in local network
    • devices inside local net not explicitly addressable, visible by outside world (a security plus)


NAT: network address translation

implementation: NAT router must:

    • outgoing datagrams: replace (source IP address, port #) of every outgoing datagram with (NAT IP address, new port #)
      . . . remote clients/servers will respond using (NAT IP address, new port #) as destination addr
    • remember (in NAT translation table) every (source IP address, port #) to (NAT IP address, new port #) translation pair
    • incoming datagrams: replace (NAT IP address, new port #) in dest fields of every incoming datagram with corresponding (source IP address, port #) stored in NAT table

NAT: network address translation

[Figure: hosts 10.0.0.1, 10.0.0.2, 10.0.0.3 behind NAT router (LAN side 10.0.0.4, WAN side 138.76.29.7)]

NAT translation table

WAN side addr        LAN side addr
138.76.29.7, 5001    10.0.0.1, 3345
……                   ……

1: host 10.0.0.1 sends datagram to 128.119.40.186, 80
     S: 10.0.0.1, 3345
     D: 128.119.40.186, 80

2: NAT router changes datagram source addr from 10.0.0.1, 3345 to 138.76.29.7, 5001, updates table
     S: 138.76.29.7, 5001
     D: 128.119.40.186, 80

3: reply arrives dest. address: 138.76.29.7, 5001
     S: 128.119.40.186, 80
     D: 138.76.29.7, 5001

4: NAT router changes datagram dest addr from 138.76.29.7, 5001 to 10.0.0.1, 3345
     S: 128.119.40.186, 80
     D: 10.0.0.1, 3345
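The four steps above can be sketched as a small translation-table class. `NatRouter` and its method names are hypothetical; handing out ports sequentially from 5001 is an assumption made only so the numbers line up with the slide, and real NAT devices also expire idle entries.

```python
class NatRouter:
    """Toy NAT: rewrites (IP, port) pairs using a translation table."""

    def __init__(self, wan_ip, first_port=5001):
        self.wan_ip = wan_ip
        self.next_port = first_port
        self.lan_to_wan = {}   # (LAN IP, LAN port) -> WAN port
        self.wan_to_lan = {}   # WAN port -> (LAN IP, LAN port)

    def outgoing(self, src, dst):
        """Steps 1-2: replace the LAN source with (NAT IP, new port)."""
        if src not in self.lan_to_wan:
            self.lan_to_wan[src] = self.next_port
            self.wan_to_lan[self.next_port] = src
            self.next_port += 1
        return (self.wan_ip, self.lan_to_wan[src]), dst

    def incoming(self, src, dst):
        """Steps 3-4: restore the LAN destination, or None if unknown."""
        ip, port = dst
        if ip != self.wan_ip or port not in self.wan_to_lan:
            return None
        return src, self.wan_to_lan[port]

nat = NatRouter("138.76.29.7")
sent = nat.outgoing(("10.0.0.1", 3345), ("128.119.40.186", 80))
reply = nat.incoming(("128.119.40.186", 80), ("138.76.29.7", 5001))
```

The `None` result for an unknown port is the root of the NAT traversal problem: a datagram that arrives without a matching table entry has no LAN host to be delivered to.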

NAT: network address translation

  • 16-bit port-number field:
    • 60,000 simultaneous connections with a single LAN-side address!
  • NAT is controversial:
    • routers should only process up to layer 3
    • violates end-to-end argument
      • NAT possibility must be taken into account by app designers, e.g., P2P applications
    • address shortage should instead be solved by IPv6


NAT traversal problem

  • client wants to connect to server with address 10.0.0.1
    • server address 10.0.0.1 local to LAN (client can’t use it as destination addr)
    • only one externally visible NATed address: 138.76.29.7
  • solution 1: statically configure NAT to forward incoming connection requests at given port to server
    • e.g., (138.76.29.7, port 2500) always forwarded to 10.0.0.1 port 25000

[Figure: external client cannot tell how to reach server 10.0.0.1, which sits behind NAT router (LAN side 10.0.0.4, WAN side 138.76.29.7)]

NAT traversal problem

  • solution 2: Universal Plug and Play (UPnP) Internet Gateway Device (IGD) Protocol. Allows NATed host to:
    • learn public IP address (138.76.29.7)
    • add/remove port mappings (with lease times)

i.e., automate static NAT port map configuration


NAT traversal problem

  • solution 3: relaying (used in Skype)
    • NATed client establishes connection to relay
    • external client connects to relay
    • relay bridges packets between two connections

steps in figure (NATed host 10.0.0.1 behind NAT router 138.76.29.7, external client, relay):

  1. connection to relay initiated by NATed host
  2. connection to relay initiated by client
  3. relaying established


ICMP: internet control message protocol

  • used by hosts & routers to communicate network-level information
    • error reporting: unreachable host, network, port, protocol
    • echo request/reply (used by ping)
  • network-layer “above” IP:
    • ICMP msgs carried in IP datagrams
  • ICMP message: type, code plus first 8 bytes of IP datagram causing error

Type   Code   description
0      0      echo reply (ping)
3      0      dest. network unreachable
3      1      dest host unreachable
3      2      dest protocol unreachable
3      3      dest port unreachable
3      6      dest network unknown
3      7      dest host unknown
4      0      source quench (congestion control - not used)
8      0      echo request (ping)
9      0      router advertisement
10     0      router discovery
11     0      TTL expired
12     0      bad IP header

Traceroute and ICMP

  • source sends series of UDP segments to dest
    • first set has TTL =1
    • second set has TTL=2, etc.
    • unlikely port number
  • when nth set of datagrams arrives at nth router:
    • router discards datagrams
    • and sends source ICMP messages (type 11, code 0)
    • ICMP messages include name of router & IP address
  • when ICMP message arrives, source records RTTs


stopping criteria:

  • UDP segment eventually arrives at destination host
  • destination returns ICMP “port unreachable” message (type 3, code 3)
  • source stops
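The probing loop and its stopping criterion can be sketched as a pure simulation with no real network traffic. The path (`r1`, `r2`, `r3`, `host`) and the names `send_probe` and `traceroute` are hypothetical; each router is modelled as decrementing TTL and answering with ICMP type 11, code 0 when it reaches zero.

```python
TTL_EXPIRED = (11, 0)        # ICMP type 11, code 0
PORT_UNREACHABLE = (3, 3)    # ICMP type 3, code 3

def send_probe(routers, destination, ttl):
    """Return (responder, ICMP message) for one UDP probe with this TTL."""
    for router in routers:
        ttl -= 1                      # every router decrements TTL
        if ttl == 0:
            return router, TTL_EXPIRED
    # Probe reached the host; the unlikely port number has no listener.
    return destination, PORT_UNREACHABLE

def traceroute(routers, destination, max_ttl=30):
    """Raise TTL by one per round until the destination itself answers."""
    hops = []
    for ttl in range(1, max_ttl + 1):
        responder, icmp = send_probe(routers, destination, ttl)
        hops.append((ttl, responder, icmp))
        if icmp == PORT_UNREACHABLE:
            break                     # stopping criterion from the slide
    return hops

hops = traceroute(["r1", "r2", "r3"], "host")
```

A real tool sends three probes per TTL value and records a round-trip time for each; the simulation keeps one probe per round to stay short.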


IPv6: motivation

  • initial motivation: 32-bit address space soon to be completely allocated.
  • additional motivation:
    • header format helps speed processing/forwarding
    • header changes to facilitate QoS

IPv6 datagram format:

    • fixed-length 40 byte header
    • no fragmentation allowed


IPv6 datagram format

header layout (each row is 32 bits):

ver | pri | flow label
payload len | next hdr | hop limit
source address (128 bits)
destination address (128 bits)
data

  • priority: identify priority among datagrams in flow
  • flow label: identify datagrams in same “flow” (concept of “flow” not well defined)
  • next header: identify upper layer protocol for data

Other changes from IPv4

  • checksum: removed entirely to reduce processing time at each hop
  • options: allowed, but outside of header, indicated by “Next Header” field
  • ICMPv6: new version of ICMP
    • additional message types, e.g. “Packet Too Big”
    • multicast group management functions


Transition from IPv4 to IPv6

  • not all routers can be upgraded simultaneously
    • no “flag days”
    • how will network operate with mixed IPv4 and IPv6 routers?
  • tunneling: IPv6 datagram carried as payload in IPv4 datagram among IPv4 routers

[Figure: IPv4 datagram = IPv4 header fields (IPv4 source, dest addr) + IPv4 payload; the IPv4 payload is a whole IPv6 datagram = IPv6 header fields (IPv6 source, dest addr) + UDP/TCP payload]

Tunneling

physical view: A (IPv6), B (IPv6), C (IPv4), D (IPv4), E (IPv6), F (IPv6) connected in a line

logical view: A (IPv6), B (IPv6), IPv4 tunnel connecting IPv6 routers, E (IPv6), F (IPv6)

  • A-to-B: IPv6
    • flow: X, src: A, dest: F, data
  • B-to-C (and on through the tunnel to E): IPv6 inside IPv4
    • outer IPv4 datagram: src: B, dest: E
    • inner IPv6 datagram: flow: X, src: A, dest: F, data
  • E-to-F: IPv6
    • flow: X, src: A, dest: F, data
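The encapsulation step can be sketched with two small record types. The class and function names below are hypothetical and the addresses are just the router letters from the slide; the point is only that the IPv6 datagram rides unchanged as the IPv4 payload.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class IPv6Datagram:
    flow: str
    src: str
    dest: str
    data: bytes

@dataclass(frozen=True)
class IPv4Datagram:
    src: str
    dest: str
    payload: IPv6Datagram   # the entire IPv6 datagram, header included

def enter_tunnel(inner, tunnel_src, tunnel_dest):
    """At B: wrap the IPv6 datagram in an IPv4 datagram addressed to E."""
    return IPv4Datagram(src=tunnel_src, dest=tunnel_dest, payload=inner)

def exit_tunnel(outer):
    """At E: strip the IPv4 header and continue forwarding as IPv6."""
    return outer.payload

original = IPv6Datagram(flow="X", src="A", dest="F", data=b"data")
in_tunnel = enter_tunnel(original, tunnel_src="B", tunnel_dest="E")
delivered = exit_tunnel(in_tunnel)
```

Routers C and D only ever look at the outer header, so they need no IPv6 support.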

IPv6: adoption

  • US National Institute of Standards and Technology (NIST) estimate [2013]:
    • ~3% of industry IP routers
    • ~11% of US gov’t routers

  • Long (long!) time for deployment, use
    • 20 years and counting!
    • think of application-level changes in last 20 years: WWW, Facebook, …
    • Why?


Interplay between routing, forwarding

routing algorithm determines end-end-path through network

forwarding table determines local forwarding at this router

local forwarding table (filled in by routing algorithm), looked up with the IP destination address in arriving packet’s header:

dest address      output link
address-range 1   3
address-range 2   2
address-range 3   2
address-range 4   1

Graph abstraction

graph: G = (N,E)

N = set of routers = { u, v, w, x, y, z }

E = set of links = { (u,v), (u,x), (u,w), (v,x), (v,w), (x,w), (x,y), (w,y), (w,z), (y,z) }

aside: graph abstraction is useful in other network contexts, e.g., P2P, where N is set of peers and E is set of TCP connections

Graph abstraction: costs

c(x,x’) = cost of link (x,x’)

e.g., c(w,z) = 5

link costs in figure: c(u,v) = 2, c(u,x) = 1, c(u,w) = 5, c(v,x) = 2, c(v,w) = 3, c(x,w) = 3, c(x,y) = 1, c(w,y) = 1, c(w,z) = 5, c(y,z) = 2

cost could always be 1, or inversely related to bandwidth, or directly related to congestion

cost of path (x1, x2, x3,…, xp) = c(x1,x2) + c(x2,x3) + … + c(xp-1,xp)

key question: what is the least-cost path between u and z ?

routing algorithm: algorithm that finds that least cost path
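The path-cost sum can be tried out directly. The sketch below assumes the link costs of the example graph as read from the figure (treat the individual values as assumptions); `LINK_COST`, `c`, and `path_cost` are illustrative names.

```python
INF = float("inf")

# Undirected link costs, assumed from the example graph.
LINK_COST = {
    ("u", "v"): 2, ("u", "x"): 1, ("u", "w"): 5,
    ("v", "x"): 2, ("v", "w"): 3,
    ("x", "w"): 3, ("x", "y"): 1,
    ("w", "y"): 1, ("w", "z"): 5,
    ("y", "z"): 2,
}

def c(a, b):
    """Cost of link (a,b); infinity if a and b are not direct neighbors."""
    return LINK_COST.get((a, b), LINK_COST.get((b, a), INF))

def path_cost(path):
    """c(x1,x2) + c(x2,x3) + ... + c(xp-1,xp) for path (x1, ..., xp)."""
    return sum(c(a, b) for a, b in zip(path, path[1:]))

direct = path_cost(["u", "w", "z"])        # 5 + 5
cheaper = path_cost(["u", "x", "y", "z"])  # 1 + 1 + 2
```

Comparing candidate paths by hand like this does not scale, which is the motivation for the routing algorithms that follow.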

Routing algorithm classification

Q: global or decentralized information?

global:

  • all routers have complete topology, link cost info
  • “link state” algorithms

decentralized:

  • router knows physically-connected neighbors, link costs to neighbors
  • iterative process of computation, exchange of info with neighbors
  • “distance vector” algorithms

Q: static or dynamic?

static:

  • routes change slowly over time

dynamic:

  • routes change more quickly
    • periodic update
    • in response to link cost changes


A Link-State Routing Algorithm

Dijkstra’s algorithm

  • net topology, link costs known to all nodes
    • accomplished via “link state broadcast”
    • all nodes have same info
  • computes least cost paths from one node (“source”) to all other nodes
    • gives forwarding table for that node
  • iterative: after k iterations, know least cost path to k dest.’s

notation:

  • c(x,y): link cost from node x to y; = ∞ if not direct neighbors
  • D(v): current value of cost of path from source to dest. v
  • p(v): predecessor node along path from source to v
  • N': set of nodes whose least cost path definitively known


Dijkstra’s Algorithm

Initialization:
  N' = {u}
  for all nodes v
    if v adjacent to u
      then D(v) = c(u,v)
      else D(v) = ∞

Loop
  find w not in N' such that D(w) is a minimum
  add w to N'
  update D(v) for all v adjacent to w and not in N':
    D(v) = min( D(v), D(w) + c(w,v) )
    /* new cost to v is either old cost to v or known
       shortest path cost to w plus cost from w to v */
until all nodes in N'
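The loop above can be sketched in Python. This is a minimal illustration, not the deck's own code: it uses a binary heap (`heapq`) in place of the linear scan for the minimum, and the six-node topology in `LINKS` is an assumed example graph with symmetric link costs.

```python
import heapq

def dijkstra(graph, source):
    """Return (D, p): least path costs and predecessors from source."""
    D = {node: float("inf") for node in graph}
    p = {}
    D[source] = 0
    done = set()                         # plays the role of N'
    heap = [(0, source)]
    while heap:
        cost, w = heapq.heappop(heap)    # node outside N' with minimum D(w)
        if w in done:
            continue                     # stale heap entry, skip it
        done.add(w)
        for v, c_wv in graph[w].items():
            # D(v) = min(D(v), D(w) + c(w,v)) for neighbors not yet in N'
            if v not in done and cost + c_wv < D[v]:
                D[v] = cost + c_wv
                p[v] = w
                heapq.heappush(heap, (D[v], v))
    return D, p

# Assumed example topology: (node, node, link cost), links are bidirectional.
LINKS = [("u", "v", 2), ("u", "x", 1), ("u", "w", 5), ("v", "x", 2),
         ("v", "w", 3), ("x", "w", 3), ("x", "y", 1), ("w", "y", 1),
         ("w", "z", 5), ("y", "z", 2)]

graph = {}
for a, b, c in LINKS:
    graph.setdefault(a, {})[b] = c
    graph.setdefault(b, {})[a] = c

D, p = dijkstra(graph, "u")
```

The heap trades the scan over all nodes outside N' for a logarithmic-time extraction, which is one of the "more efficient implementations" usually meant in complexity discussions of this algorithm.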

69 of 144

Dijkstra’s algorithm: example

[Figure: six-node graph (u, v, w, x, y, z) with link costs]

Step  N'       D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u        7,u        3,u        5,u        ∞          ∞
1     uw       6,w                   5,u        11,w       ∞
2     uwx      6,w                              11,w       14,x
3     uwxv                                      10,v       14,x
4     uwxvy                                                12,y
5     uwxvyz

notes:

  • construct shortest path tree by tracing predecessor nodes
  • ties can exist (can be broken arbitrarily)

70 of 144

Dijkstra’s algorithm: another example

[Figure: six-node graph (u, v, w, x, y, z) with link costs]

Step  N'       D(v),p(v)  D(w),p(w)  D(x),p(x)  D(y),p(y)  D(z),p(z)
0     u        2,u        5,u        1,u        ∞          ∞
1     ux       2,u        4,x                   2,x        ∞
2     uxy      2,u        3,y                              4,y
3     uxyv                3,y                              4,y
4     uxyvw                                                4,y
5     uxyvwz

71 of 144

Dijkstra’s algorithm: example (2)

resulting shortest-path tree from u:

[Figure: shortest-path tree rooted at u, reaching v, x, y, w, z]

resulting forwarding table in u:

destination  link
v            (u,v)
x            (u,x)
y            (u,x)
w            (u,x)
z            (u,x)
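One way to see where these entries come from is to trace predecessors back toward the source, as the earlier note on shortest path trees suggests. The sketch below assumes the predecessor values from the final rows of the preceding table (p(v) = u, p(x) = u, p(y) = x, p(w) = y, p(z) = y); the function name is hypothetical.

```python
def forwarding_table(pred, source):
    """Map each destination to the first link on its least cost path."""
    table = {}
    for dest in pred:
        hop = dest
        # Walk back along predecessors until the node adjacent to the source.
        while pred[hop] != source:
            hop = pred[hop]
        table[dest] = (source, hop)
    return table

# Predecessors read off the Dijkstra table for source u.
pred = {"v": "u", "x": "u", "y": "x", "w": "y", "z": "y"}
table = forwarding_table(pred, "u")
```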

72 of 144

Dijkstra’s algorithm, discussion

algorithm complexity: n nodes

  • each iteration: need to check all nodes, w, not in N'
  • n(n+1)/2 comparisons: O(n²)
  • more efficient implementations possible: O(n log n)

oscillations possible:

  • e.g., suppose link cost equals amount of carried traffic:

[Figure: four panels over nodes A, B, C, D. Panel 1: initially. Panels 2 to 4: given these costs, find new routing, resulting in new costs; the link costs (0, 1, 1+e, 2+e) flip back and forth between successive panels.]


74 of 144

Distance vector algorithm

Bellman-Ford equation (dynamic programming)

let

dx(y) := cost of least-cost path from x to y

then

dx(y) = min_v { c(x,v) + dv(y) }

where:

  • c(x,v): cost to neighbor v
  • dv(y): cost from neighbor v to destination y
  • min taken over all neighbors v of x

75 of 144

Bellman-Ford example

[Figure: six-node graph (u, v, w, x, y, z) with link costs]

clearly, dv(z) = 5, dx(z) = 3, dw(z) = 3

B-F equation says:

du(z) = min { c(u,v) + dv(z),
              c(u,x) + dx(z),
              c(u,w) + dw(z) }
      = min { 2 + 5, 1 + 3, 5 + 3 } = 4

node achieving minimum is next hop in shortest path, used in forwarding table
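The same computation can be written as a few lines of Python. This is only a sketch of one application of the Bellman-Ford equation at node u, using the costs quoted on this slide; the function name is hypothetical.

```python
def bellman_ford_step(link_cost, neighbor_dist):
    """Apply d_x(y) = min over neighbors v of c(x,v) + d_v(y).

    link_cost[v]     -- c(x,v), cost of the direct link to neighbor v
    neighbor_dist[v] -- d_v(y), neighbor v's least cost to the destination
    Returns (least cost, neighbor achieving it).
    """
    candidates = {v: link_cost[v] + neighbor_dist[v] for v in link_cost}
    next_hop = min(candidates, key=candidates.get)
    return candidates[next_hop], next_hop

# Values from the slide: c(u,v)=2, c(u,x)=1, c(u,w)=5; dv(z)=5, dx(z)=3, dw(z)=3
cost, next_hop = bellman_ford_step({"v": 2, "x": 1, "w": 5},
                                   {"v": 5, "x": 3, "w": 3})
```

The returned neighbor is the next hop that would be written into u's forwarding table for destination z.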

76 of 144

Distance vector algorithm

  • Dx(y) = estimate of least cost from x to y
    • x maintains distance vector Dx = [Dx(y): y ∈ N]
  • node x:
    • knows cost to each neighbor v: c(x,v)
    • maintains its neighbors’ distance vectors. For each neighbor v, x maintains Dv = [Dv(y): y ∈ N]


77 of 144

Distance vector algorithm

key idea:

  • from time-to-time, each node sends its own distance vector estimate to neighbors
  • when x receives new DV estimate from neighbor, it updates its own DV using B-F equation:

Dx(y) ← min_v { c(x,v) + Dv(y) } for each node y ∈ N

  • under minor, natural conditions, the estimate Dx(y) converges to the actual least cost dx(y)

78 of 144

Distance vector algorithm

iterative, asynchronous: each local iteration caused by:

  • local link cost change
  • DV update message from neighbor

distributed:

  • each node notifies neighbors only when its DV changes
    • neighbors then notify their neighbors if necessary

each node:

  wait for (change in local link cost or msg from neighbor)
  recompute estimates
  if DV to any dest has changed, notify neighbors

79 of 144

[Figure: three-node network with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]

initial distance tables (rows: from, columns: cost to):

node x table        node y table        node z table
      x  y  z             x  y  z             x  y  z
  x   0  2  7         x   ∞  ∞  ∞         x   ∞  ∞  ∞
  y   ∞  ∞  ∞         y   2  0  1         y   ∞  ∞  ∞
  z   ∞  ∞  ∞         z   ∞  ∞  ∞         z   7  1  0

after receiving the distance vectors of y and z, node x recomputes:

Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2

Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3

80 of 144

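[Figure: three-node network with link costs c(x,y) = 2, c(y,z) = 1, c(x,z) = 7]

distance tables over time (each row lists the costs to x, y, z):

node x table
  initially:              from x: 0 2 7   from y: ∞ ∞ ∞   from z: ∞ ∞ ∞
  after first exchange:   from x: 0 2 3   from y: 2 0 1   from z: 7 1 0
  after second exchange:  from x: 0 2 3   from y: 2 0 1   from z: 3 1 0

node y table
  initially:              from x: ∞ ∞ ∞   from y: 2 0 1   from z: ∞ ∞ ∞
  after first exchange:   from x: 0 2 7   from y: 2 0 1   from z: 7 1 0
  after second exchange:  from x: 0 2 3   from y: 2 0 1   from z: 3 1 0

node z table
  initially:              from x: ∞ ∞ ∞   from y: ∞ ∞ ∞   from z: 7 1 0
  after first exchange:   from x: 0 2 7   from y: 2 0 1   from z: 3 1 0
  after second exchange:  from x: 0 2 3   from y: 2 0 1   from z: 3 1 0

Dx(y) = min{c(x,y) + Dy(y), c(x,z) + Dz(y)} = min{2+0 , 7+1} = 2

Dx(z) = min{c(x,y) + Dy(z), c(x,z) + Dz(z)} = min{2+1 , 7+0} = 3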
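The three-node example can be replayed with a short simulation. This is a simplified sketch: it runs the nodes in synchronous rounds (every node reads its neighbors' vectors from the previous round), whereas the algorithm described in these slides is asynchronous. The link costs 2, 1, and 7 are taken from the example; the function name is hypothetical.

```python
INF = float("inf")

def distance_vector(cost):
    """Run synchronous DV rounds until no estimate changes."""
    nodes = sorted(cost)
    # Each node starts out knowing only the costs of its direct links.
    dv = {x: {y: (0 if x == y else cost[x].get(y, INF)) for y in nodes}
          for x in nodes}
    rounds = 0
    changed = True
    while changed:
        changed = False
        sent = {x: dict(dv[x]) for x in nodes}   # vectors advertised this round
        for x in nodes:
            for y in nodes:
                if x == y:
                    continue
                # Bellman-Ford update over all neighbors v of x
                best = min(cost[x][v] + sent[v][y] for v in cost[x])
                if best != dv[x][y]:
                    dv[x][y] = best
                    changed = True
        if changed:
            rounds += 1
    return dv, rounds

cost = {"x": {"y": 2, "z": 7},
        "y": {"x": 2, "z": 1},
        "z": {"x": 7, "y": 1}}
dv, rounds = distance_vector(cost)
```

After the run, x's estimate to z and z's estimate to x have both dropped from 7 to 3, matching the recomputation shown for node x.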

81 of 144

Distance vector: link cost changes

link cost changes:

  • node detects local link cost change
  • updates routing info, recalculates distance vector
  • if DV changes, notify neighbors

“good news travels fast”

[Figure: three-node network with c(y,z) = 1 and c(x,z) = 50; the cost of link x-y changes from 4 to 1]

t0 : y detects link-cost change, updates its DV, informs its neighbors.

t1 : z receives update from y, updates its table, computes new least cost to x , sends its neighbors its DV.

t2 : y receives z’s update, updates its distance table. y’s least costs do not change, so y does not send a message to z.

82 of 144

Distance vector: link cost changes

link cost changes:

  • node detects local link cost change
  • bad news travels slow - “count to infinity” problem!
  • 44 iterations before algorithm stabilizes: see text

[Figure: three-node network with c(y,z) = 1 and c(x,z) = 50; the cost of link x-y changes from 4 to 60]

poisoned reverse:

  • If Z routes through Y to get to X :
    • Z tells Y its (Z’s) distance to X is infinite (so Y won’t route to X via Z)
  • will this completely solve count to infinity problem?
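A small simulation makes the slow convergence, and the effect of poisoned reverse, concrete. It is a sketch under assumptions: the costs after the change are c(x,y) = 60, c(y,z) = 1, c(x,z) = 50, the old estimates to x are 4 at y and 5 at z, and each loop pass batches one update at y followed by one at z. Because of that batching, the round count it reports is not meant to match the iteration figure quoted on the slide.

```python
INF = float("inf")

def cost_increase(poisoned_reverse):
    """Simulate y and z re-converging on their cost to x after c(x,y) rises."""
    c_xy, c_yz, c_xz = 60, 1, 50     # link costs after the change
    d_y, d_z = 4, 5                  # stale least costs to x
    z_via_y = True                   # z currently reaches x through y
    rounds = 0
    while True:
        # y recomputes from what z advertises (infinity if z routes via y)
        adv_z = INF if (poisoned_reverse and z_via_y) else d_z
        new_y = min(c_xy, c_yz + adv_z)
        y_via_z = c_yz + adv_z < c_xy
        # z recomputes from what y advertises (infinity if y routes via z)
        adv_y = INF if (poisoned_reverse and y_via_z) else new_y
        new_z = min(c_xz, c_yz + adv_y)
        z_via_y = c_yz + adv_y < c_xz
        if (new_y, new_z) == (d_y, d_z):
            return d_y, d_z, rounds  # nothing changed: converged
        d_y, d_z = new_y, new_z
        rounds += 1

plain = cost_increase(poisoned_reverse=False)
poisoned = cost_increase(poisoned_reverse=True)
```

Both runs end with y reaching x at cost 51 (through z) and z at cost 50 (directly), but the run without poisoned reverse creeps upward a couple of units per round. With only two nodes in the loop, poisoned reverse removes the problem; it does not help in general when the loop involves three or more nodes.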

83 of 144

Comparison of LS and DV algorithms

message complexity

  • LS: with n nodes, E links, O(nE) msgs sent
  • DV: exchange between neighbors only
    • convergence time varies

speed of convergence

  • LS: O(n²) algorithm requires O(nE) msgs
    • may have oscillations
  • DV: convergence time varies
    • may be routing loops
    • count-to-infinity problem

robustness: what happens if router malfunctions?

LS:

    • node can advertise incorrect link cost
    • each node computes only its own table

DV:

    • DV node can advertise incorrect path cost
    • each node’s table used by others
      • errors propagate thru network


85 of 144

Hierarchical routing

scale: with 600 million destinations:

  • can’t store all dest’s in routing tables!
  • routing table exchange would swamp links!

administrative autonomy

  • internet = network of networks
  • each network admin may want to control routing in its own network


our routing study thus far - idealization

  • all routers identical
  • network “flat”

… not true in practice

86 of 144

Hierarchical routing

  • aggregate routers into regions, “autonomous systems” (AS)
  • routers in same AS run same routing protocol
    • “intra-AS” routing protocol
    • routers in different AS can run different intra-AS routing protocol

gateway router:

  • at “edge” of its own AS
  • has link to router in another AS


87 of 144

Interconnected ASes

  • forwarding table configured by both intra- and inter-AS routing algorithm
    • intra-AS sets entries for internal dests
    • inter-AS & intra-AS sets entries for external dests

[Figure: three interconnected ASes (AS1 with routers 1a to 1d, AS2 with 2a to 2c, AS3 with 3a to 3c); inside a router, the intra-AS routing algorithm and the inter-AS routing algorithm both feed the forwarding table]

88 of 144

Inter-AS tasks

  • suppose router in AS1 receives datagram destined outside of AS1:
    • router should forward packet to gateway router, but which one?

AS1 must:

  1. learn which dests are reachable through AS2, which through AS3
  2. propagate this reachability info to all routers in AS1

job of inter-AS routing!

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c); AS2 and AS3 each also connect to other networks]

89 of 144

Example: setting forwarding table in router 1d

  • suppose AS1 learns (via inter-AS protocol) that subnet x reachable via AS3 (gateway 1c), but not via AS2
    • inter-AS protocol propagates reachability info to all internal routers
  • router 1d determines from intra-AS routing info that its interface I is on the least cost path to 1c
    • installs forwarding table entry (x,I)

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c), with subnet x among the other networks]

90 of 144

Example: choosing among multiple ASes

  • now suppose AS1 learns from inter-AS protocol that subnet x is reachable from AS3 and from AS2.
  • to configure forwarding table, router 1d must determine which gateway it should forward packets towards for dest x
    • this is also job of inter-AS routing protocol!

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c), with subnet x among the other networks; a question mark indicates the open choice of gateway]

91 of 144

Example: choosing among multiple ASes

  • now suppose AS1 learns from inter-AS protocol that subnet x is reachable from AS3 and from AS2.
  • to configure forwarding table, router 1d must determine towards which gateway it should forward packets for dest x
    • this is also job of inter-AS routing protocol!
  • hot potato routing: send packet towards closest of two routers.

  1. learn from inter-AS protocol that subnet x is reachable via multiple gateways
  2. use routing info from intra-AS protocol to determine costs of least-cost paths to each of the gateways
  3. hot potato routing: choose the gateway that has the smallest least cost
  4. determine from forwarding table the interface I that leads to least-cost gateway. Enter (x,I) in forwarding table
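The gateway choice can be sketched as a minimum over intra-AS costs. The cost values and interface names below are hypothetical, chosen only to illustrate router 1d choosing between gateways 1c and 1b.

```python
def hot_potato(gateway_costs, gateway_interface):
    """Pick the gateway with the smallest intra-AS least cost.

    gateway_costs[g]     -- least cost from this router to gateway g
    gateway_interface[g] -- local interface on the least cost path to g
    """
    best = min(gateway_costs, key=gateway_costs.get)
    return best, gateway_interface[best]

# Hypothetical view from router 1d: subnet x is reachable via 1c and via 1b.
gateway, interface = hot_potato({"1c": 2, "1b": 3},
                                {"1c": "I1", "1b": "I2"})
entry = ("x", interface)   # the (x, I) pair entered in the forwarding table
```

Note that the choice ignores everything that happens after the packet leaves the AS, which is what gives hot potato routing its name.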


93 of 144

Intra-AS Routing

  • also known as interior gateway protocols (IGP)
  • most common intra-AS routing protocols:
    • RIP: Routing Information Protocol
    • OSPF: Open Shortest Path First
    • IGRP: Interior Gateway Routing Protocol (Cisco proprietary)


94 of 144

RIP ( Routing Information Protocol)

  • included in BSD-UNIX distribution in 1982
  • distance vector algorithm
    • distance metric: # hops (max = 15 hops), each link has cost 1
    • DVs exchanged with neighbors every 30 sec in response message (aka advertisement)
    • each advertisement: list of up to 25 destination subnets (in IP addressing sense)

[Figure: routers A, B, C, D interconnecting subnets u, v, w, x, y, z]

from router A to destination subnets:

subnet  hops
u       1
v       2
w       2
x       3
y       3
z       2

95 of 144

RIP: example

[Figure: routers A, B, C, D and subnets w, x, y, z]

routing table in router D

destination subnet   next router   # hops to dest
w                    A             2
y                    B             2
z                    B             7
x                    --            1
….                   ….            ....

96 of 144

RIP: example

[Figure: routers A, B, C, D and subnets w, x, y, z]

A-to-D advertisement

dest   next   hops
w      -      1
x      -      1
z      C      4
….     …      ...

routing table in router D (the entry for z changes from B, 7 to A, 5)

destination subnet   next router   # hops to dest
w                    A             2
y                    B             2
z                    A             5
x                    --            1
….                   ….            ....
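How router D processes A's advertisement can be sketched as follows. The table contents come from the example; the function name and the dictionary layout are assumptions, and the rule that news from the current next router is always accepted is the usual distance vector behavior rather than something stated on the slide.

```python
RIP_INFINITY = 16   # hop count that RIP treats as unreachable

def rip_update(table, neighbor, advertisement):
    """Merge a neighbor's advertisement into a RIP routing table.

    table[dest] = (next_router, hops); next_router None means directly attached.
    advertisement[dest] = hops as seen from the neighbor.
    """
    for dest, hops in advertisement.items():
        new_hops = min(hops + 1, RIP_INFINITY)   # one extra hop to reach neighbor
        next_router, old_hops = table.get(dest, (None, RIP_INFINITY))
        # Accept a strictly shorter route, or any update from the router
        # this destination is already routed through.
        if new_hops < old_hops or next_router == neighbor:
            table[dest] = (neighbor, new_hops)
    return table

# Router D's table and A's advertisement, as in the example.
table_d = {"w": ("A", 2), "y": ("B", 2), "z": ("B", 7), "x": (None, 1)}
rip_update(table_d, "A", {"w": 1, "x": 1, "z": 4})
```

Only the entry for z changes: going through A costs 4 + 1 = 5 hops, which beats the old 7 hops through B.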

97 of 144

RIP: link failure, recovery

if no advertisement heard after 180 sec --> neighbor/link declared dead

    • routes via neighbor invalidated
    • new advertisements sent to neighbors
    • neighbors in turn send out new advertisements (if tables changed)
    • link failure info quickly (?) propagates to entire net
    • poison reverse used to prevent ping-pong loops (infinite distance = 16 hops)


98 of 144

RIP table processing

  • RIP routing tables managed by application-level process called route-d (daemon)
  • advertisements sent in UDP packets, periodically repeated

[Figure: two routers, each with the stack physical, link, network (IP), transport (UDP); in each, the routed process runs above UDP and maintains the forwarding table used by the network layer]

99 of 144

OSPF (Open Shortest Path First)

  • “open”: publicly available
  • uses link state algorithm
    • LS packet dissemination
    • topology map at each node
    • route computation using Dijkstra’s algorithm
  • OSPF advertisement carries one entry per neighbor
  • advertisements flooded to entire AS
    • carried in OSPF messages directly over IP (rather than TCP or UDP)
  • IS-IS routing protocol: nearly identical to OSPF


100 of 144

OSPF “advanced” features (not in RIP)

  • security: all OSPF messages authenticated (to prevent malicious intrusion)
  • multiple same-cost paths allowed (only one path in RIP)
  • for each link, multiple cost metrics for different ToS (e.g., satellite link cost set “low” for best effort ToS; high for real time ToS)
  • integrated uni- and multicast support:
    • Multicast OSPF (MOSPF) uses same topology data base as OSPF
  • hierarchical OSPF in large domains.


101 of 144

Hierarchical OSPF

[Figure: hierarchical OSPF: a backbone containing backbone routers and a boundary router, joined by area border routers to area 1, area 2, and area 3, each containing internal routers]

102 of 144

Hierarchical OSPF

  • two-level hierarchy: local area, backbone.
    • link-state advertisements only in area
    • each node has detailed area topology; only knows direction (shortest path) to nets in other areas.
  • area border routers: “summarize” distances to nets in own area, advertise to other Area Border routers.
  • backbone routers: run OSPF routing limited to backbone.
  • boundary routers: connect to other AS’s.


103 of 144

Internet inter-AS routing: BGP

  • BGP (Border Gateway Protocol): the de facto inter-domain routing protocol
    • “glue that holds the Internet together”
  • BGP provides each AS a means to:
    • eBGP: obtain subnet reachability information from neighboring ASs.
    • iBGP: propagate reachability information to all AS-internal routers.
    • determine “good” routes to other networks based on reachability information and policy.
  • allows subnet to advertise its existence to rest of Internet: “I am here”


104 of 144

BGP basics

  • when AS3 advertises a prefix to AS1:
    • AS3 promises it will forward datagrams towards that prefix
    • AS3 can aggregate prefixes in its advertisement

  • BGP session: two BGP routers (“peers”) exchange BGP messages:
    • advertising paths to different destination network prefixes (“path vector” protocol)
    • exchanged over semi-permanent TCP connections

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c); a BGP message is exchanged between ASes]

105 of 144

BGP basics: distributing path information

  • using eBGP session between 3a and 1c, AS3 sends prefix reachability info to AS1.
    • 1c can then use iBGP to distribute new prefix info to all routers in AS1
    • 1b can then re-advertise new reachability info to AS2 over 1b-to-2a eBGP session
  • when router learns of new prefix, it creates entry for prefix in its forwarding table.

[Figure: AS1, AS2, and AS3 with their routers; legend: eBGP session (between ASes), iBGP session (within an AS)]

106 of 144

Path attributes and BGP routes

  • advertised prefix includes BGP attributes
    • prefix + attributes = “route”
  • two important attributes:
    • AS-PATH: contains ASs through which prefix advertisement has passed: e.g., AS 67, AS 17
    • NEXT-HOP: indicates specific internal-AS router to next-hop AS. (may be multiple links from current AS to next-hop-AS)
  • gateway router receiving route advertisement uses import policy to accept/decline
    • e.g., never route through AS x
    • policy-based routing


107 of 144

BGP route selection

  • router may learn about more than 1 route to destination AS, selects route based on:
    1. local preference value attribute: policy decision
    2. shortest AS-PATH
    3. closest NEXT-HOP router: hot potato routing
    4. additional criteria
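The first three elimination rules can be sketched as a lexicographic comparison. The `Route` fields, the default local preference of 100, and the `igp_cost` attribute (intra-AS cost to the NEXT-HOP router) are assumptions made for illustration; the prefixes, AS paths, and NEXT-HOP addresses are the ones used in this deck's examples.

```python
from dataclasses import dataclass

@dataclass
class Route:
    prefix: str
    as_path: tuple
    next_hop: str
    local_pref: int = 100   # hypothetical default policy value
    igp_cost: int = 0       # assumed intra-AS cost to the NEXT-HOP router

def select_route(routes):
    """Prefer highest local preference, then shortest AS-PATH,
    then closest NEXT-HOP (hot potato)."""
    return min(routes,
               key=lambda r: (-r.local_pref, len(r.as_path), r.igp_cost))

routes = [
    Route("138.16.64/22", ("AS3", "AS131", "AS201"), "201.44.13.125"),
    Route("138.16.64/22", ("AS2", "AS17"), "111.99.86.55"),
]
best = select_route(routes)

# Policy outranks path length: raising local preference flips the choice.
routes[0].local_pref = 200
best_with_policy = select_route(routes)
```

Real BGP implementations apply further tie-breakers after these three, which corresponds to the "additional criteria" item in the list.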


108 of 144

BGP messages

  • BGP messages exchanged between peers over TCP connection
  • BGP messages:
    • OPEN: opens TCP connection to peer and authenticates sender
    • UPDATE: advertises new path (or withdraws old)
    • KEEPALIVE: keeps connection alive in absence of UPDATES; also ACKs OPEN request
    • NOTIFICATION: reports errors in previous msg; also used to close connection


109 of 144

Putting it All Together: How Does an Entry Get Into a Router’s Forwarding Table?

  • Answer is complicated!

  • Ties together hierarchical routing (Section 4.5.3) with BGP (4.6.3) and OSPF (4.6.2).
  • Provides nice overview of BGP!

110 of 144

How does entry get in forwarding table?

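[Figure: router with numbered ports; the Dest IP of an arriving datagram is looked up in the local forwarding table, whose entries are set by the routing algorithms]

local forwarding table

prefix          output port
138.16.64/22    3
124.12/16       2
212/8           4
…………..

Assume prefix is in another AS.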

111 of 144

How does entry get in forwarding table?

High-level overview

  1. Router becomes aware of prefix
  2. Router determines output port for prefix
  3. Router enters prefix-port in forwarding table

112 of 144

Router becomes aware of prefix

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c); a BGP message arrives at AS1]

  • BGP message contains “routes”
  • “route” is a prefix and attributes: AS-PATH, NEXT-HOP,…
  • Example: route:
    • Prefix:138.16.64/22 ; AS-PATH: AS3 AS131 ; NEXT-HOP: 201.44.13.125

113 of 144

Router may receive multiple routes

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c), with BGP messages arriving at AS1]

  • Router may receive multiple routes for same prefix
  • Has to select one route

114 of 144

Select best BGP route to prefix

  • Router selects route based on shortest AS-PATH
  • Example:
    • AS2 AS17 to 138.16.64/22
    • AS3 AS131 AS201 to 138.16.64/22
    • the first route is selected: its AS-PATH lists two ASes, the other lists three

  • What if there is a tie? We’ll come back to that!

115 of 144

Find best intra-route to BGP route

  • Use selected route’s NEXT-HOP attribute
    • Route’s NEXT-HOP attribute is the IP address of the router interface that begins the AS PATH.
  • Example:
    • AS-PATH: AS2 AS17 ; NEXT-HOP: 111.99.86.55
  • Router uses OSPF to find shortest path from 1c to 111.99.86.55

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c); the router interface with address 111.99.86.55 is labeled]

116 of 144

Router identifies port for route

  • Identifies port along the OSPF shortest path
  • Adds prefix-port entry to its forwarding table:
    • (138.16.64/22 , port 4)

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c); the router's ports are numbered 1 to 4]

117 of 144

Hot Potato Routing

  • Suppose there are two or more best inter-routes.
  • Then choose route with closest NEXT-HOP
    • Use OSPF to determine which gateway is closest
    • Q: From 1c, choose AS3 AS131 or AS2 AS17?
    • A: route AS3 AS131 since it is closer

[Figure: AS1 (routers 1a to 1d) connected to AS2 (2a to 2c) and AS3 (3a to 3c)]

118 of 144

How does entry get in forwarding table?

Summary

  1. Router becomes aware of prefix
    • via BGP route advertisements from other routers
  2. Determine router output port for prefix
    • Use BGP route selection to find best inter-AS route
    • Use OSPF to find best intra-AS route leading to best inter-AS route
    • Router identifies router port for that best route
  3. Enter prefix-port entry in forwarding table

119 of 144

BGP routing policy


  • A,B,C are provider networks
  • X,W,Y are customer (of provider networks)
  • X is dual-homed: attached to two networks
    • X does not want to route from B via X to C
    • .. so X will not advertise to B a route to C

[Figure: provider networks A, B, C and customer networks W, X, Y; legend: customer network, provider network]

120 of 144

BGP routing policy (2)


  • A advertises path AW to B
  • B advertises path BAW to X
  • Should B advertise path BAW to C?
    • No way! B gets no “revenue” for routing CBAW since neither W nor C are B’s customers
    • B wants to force C to route to W via A
    • B wants to route only to/from its customers!
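B's export rule can be sketched as a simple predicate. The customer relationships in `CUSTOMERS` are an assumption read from the example (W attached to A, X dual-homed to B and C, Y attached to C), and the function name is hypothetical; real BGP export policy is configured per router and is far richer than this.

```python
# Assumed provider -> customers relationships for the example networks.
CUSTOMERS = {"A": {"W"}, "B": {"X"}, "C": {"X", "Y"}}

def should_advertise(provider, path, neighbor, customers=CUSTOMERS):
    """A provider carries traffic only to or from its own customers, so it
    advertises a path only if the destination or the neighbor is a customer."""
    own = customers[provider]
    destination = path[-1]
    return destination in own or neighbor in own

path_baw = ["B", "A", "W"]
to_x = should_advertise("B", path_baw, "X")   # X is B's customer
to_c = should_advertise("B", path_baw, "C")   # neither W nor C is B's customer
```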

[Figure: provider networks A, B, C and customer networks W, X, Y; legend: customer network, provider network]

121 of 144

Why different Intra-, Inter-AS routing ?

policy:

  • inter-AS: admin wants control over how its traffic is routed, who routes through its net.
  • intra-AS: single admin, so no policy decisions needed

scale:

  • hierarchical routing saves table size, reduces update traffic

performance:

  • intra-AS: can focus on performance
  • inter-AS: policy may dominate over performance



123 of 144

Broadcast routing

  • deliver packets from source to all other nodes
  • source duplication is inefficient:

[Figure: two panels with routers R1, R2, R3, R4. Source duplication: the source creates and transmits every duplicate itself. In-network duplication: duplicates are created inside the network]

  • source duplication: how does source determine recipient addresses?

124 of 144

In-network duplication

  • flooding: when node receives broadcast packet, sends copy to all neighbors
    • problems: cycles & broadcast storm
  • controlled flooding: node only broadcasts pkt if it hasn’t broadcast same packet before
    • node keeps track of packet ids already broadcast
    • or reverse path forwarding (RPF): only forward packet if it arrived on shortest path between node and source
  • spanning tree:
    • no redundant packets received by any node
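Controlled flooding with packet-id tracking can be sketched as a breadth-first walk in which each node rebroadcasts a given packet at most once. The four-node ring topology is hypothetical, and the sketch simplifies by having a node send to all of its neighbors, including the one the packet came from.

```python
from collections import deque

def controlled_flood(graph, source):
    """Flood one packet; every node broadcasts it at most once.

    Returns (nodes reached, number of link transmissions).
    """
    seen = {source}              # nodes that already hold this packet id
    transmissions = 0
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            transmissions += 1   # a copy is sent on every attached link
            if neighbor not in seen:
                seen.add(neighbor)        # first copy: remember the id
                queue.append(neighbor)    # ...and rebroadcast later
            # otherwise the duplicate is dropped, which breaks the cycle
    return seen, transmissions

# Hypothetical ring A-B-C-D-A: uncontrolled flooding would loop forever here.
ring = {"A": ["B", "D"], "B": ["A", "C"], "C": ["B", "D"], "D": ["A", "C"]}
reached, transmissions = controlled_flood(ring, "A")
```

Every node is reached and the flood terminates, but redundant copies are still transmitted, which is the cost a spanning tree avoids.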


125 of 144

Spanning tree

  • first construct a spanning tree
  • nodes then forward/make copies only along spanning tree

[Figure: spanning tree over nodes A, B, c, D, E, F, G. (a) broadcast initiated at A; (b) broadcast initiated at D]

126 of 144

Spanning tree: creation

  • center node
  • each node sends unicast join message to center node
    • message forwarded until it arrives at a node already belonging to spanning tree

[Figure: nodes A, B, c, D, E, F, G. (a) stepwise construction of spanning tree (center: E), with join messages numbered 1 to 5; (b) constructed spanning tree]

127 of 144

Multicast routing: problem statement

goal: find a tree (or trees) connecting routers having local mcast group members

  • tree: not all paths between routers used
  • shared-tree: same tree used by all group members

  • source-based: different tree from each sender to rcvrs

[Figure: two panels, shared tree and source-based trees; legend: group member, not group member, router with a group member, router without group member]

128 of 144

Approaches for building mcast trees

approaches:

  • source-based tree: one tree per source
    • shortest path trees
    • reverse path forwarding
  • group-shared tree: group uses one tree
    • minimal spanning (Steiner)
    • center-based trees


…we first look at basic approaches, then specific protocols adopting these approaches

129 of 144

Shortest path tree

  • mcast forwarding tree: tree of shortest path routes from source to all receivers
    • Dijkstra’s algorithm

[Figure: routers R1 to R7 with source s; links used for forwarding are labeled i = 1 to 6, where i indicates the order in which the link was added by the algorithm; legend: router with attached group member, router with no attached group member]

130 of 144

Reverse path forwarding

  • rely on router’s knowledge of unicast shortest path from it to sender
  • each router has simple forwarding behavior:

if (mcast datagram received on incoming link on shortest path back to source)
  then flood datagram onto all outgoing links
  else ignore datagram
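The forwarding rule on this slide can be sketched as a single function. The link names are hypothetical, and `next_hop_to_source` stands for the link the router's unicast routing table would use to reach the sender.

```python
def rpf_forward(links, next_hop_to_source, arrival_link):
    """Reverse path forwarding decision at one router.

    Returns the links to copy the datagram onto (empty list = ignore it).
    """
    if arrival_link != next_hop_to_source:
        return []                 # not on the shortest path back to the source
    # Flood onto every outgoing link except the one the datagram arrived on.
    return [link for link in links if link != arrival_link]

links = ["to_R1", "to_R3", "to_R4"]
accepted = rpf_forward(links, next_hop_to_source="to_R1", arrival_link="to_R1")
ignored = rpf_forward(links, next_hop_to_source="to_R1", arrival_link="to_R3")
```

The check needs only the router's existing unicast routing state, so no per-packet memory of what has already been forwarded is required.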

131 of 144

Reverse path forwarding: example

  • result is a source-specific reverse SPT
    • may be a bad choice with asymmetric links

[Figure: routers R1 to R7 with source s; legend: router with attached group member, router with no attached group member, datagram will be forwarded, datagram will not be forwarded]

132 of 144

Reverse path forwarding: pruning

  • forwarding tree contains subtrees with no mcast group members
    • no need to forward datagrams down subtree
    • “prune” msgs sent upstream by router with no downstream group members

[Figure: routers R1 to R7 with source s; prune messages (P) travel upstream; legend: router with attached group member, router with no attached group member, prune message, links with multicast forwarding]

133 of 144

Shared-tree: Steiner tree

  • Steiner tree: minimum cost tree connecting all routers with attached group members
  • problem is NP-complete
  • excellent heuristics exist
  • not used in practice:
    • computational complexity
    • information about entire network needed
    • monolithic: rerun whenever a router needs to join/leave


134 of 144

Center-based trees

  • single delivery tree shared by all
  • one router identified as “center” of tree
  • to join:
    • edge router sends unicast join-msg addressed to center router
    • join-msg “processed” by intermediate routers and forwarded towards center
    • join-msg either hits existing tree branch for this center, or arrives at center
    • path taken by join-msg becomes new branch of tree for this router
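The join procedure can be sketched as grafting a path onto a growing tree. The router names and the unicast paths toward the center are hypothetical; each path stands for the route a join-msg would follow under ordinary unicast forwarding.

```python
def add_join(tree_nodes, tree_edges, path_to_center):
    """Graft a joining router onto a center-based tree.

    path_to_center lists the routers a join-msg visits, joining router first
    and center last. Grafting stops at the first router already on the tree.
    """
    for here, nxt in zip(path_to_center, path_to_center[1:]):
        if here in tree_nodes:
            break                      # join-msg hit an existing branch
        tree_nodes.add(here)
        tree_edges.add((here, nxt))    # this hop becomes a new branch

# Hypothetical example with center E and three joining routers.
tree_nodes, tree_edges = {"E"}, set()
for path in (["F", "E"], ["G", "D", "E"], ["B", "D", "E"]):
    add_join(tree_nodes, tree_edges, path)
```

The third join stops at D, which is already on the tree because of the second join, so only the branch from B to D is added.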


135 of 144

Center-based trees: example

suppose R6 chosen as center:

[Figure: routers R1 to R7; join paths toward R6 are numbered in the order in which join messages are generated; legend: router with attached group member, router with no attached group member]

136 of 144

Internet Multicasting Routing: DVMRP

  • DVMRP: distance vector multicast routing protocol, RFC1075
  • flood and prune: reverse path forwarding, source-based tree
    • RPF tree based on DVMRP’s own routing tables constructed by communicating DVMRP routers
    • no assumptions about underlying unicast
    • initial datagram to mcast group flooded everywhere via RPF
    • routers not wanting group: send upstream prune msgs


137 of 144

DVMRP: continued…

  • soft state: DVMRP router periodically (1 min.) “forgets” that branches are pruned:
    • mcast data again flows down unpruned branch
    • downstream router: reprune or else continue to receive data
  • routers can quickly regraft to tree
    • following IGMP join at leaf
  • odds and ends
    • commonly implemented in commercial routers

138 of 144

Tunneling

Q: how to connect “islands” of multicast routers in a “sea” of unicast routers?

  • mcast datagram encapsulated inside “normal” (non-multicast-addressed) datagram
  • normal IP datagram sent thru “tunnel” via regular IP unicast to receiving mcast router (recall IPv6 inside IPv4 tunneling)
  • receiving mcast router unencapsulates to get mcast datagram

[Figure: two panels, physical topology and logical topology]

139 of 144

PIM: Protocol Independent Multicast

  • not dependent on any specific underlying unicast routing algorithm (works with all)
  • two different multicast distribution scenarios :


dense:

  • group members densely packed, in “close” proximity.
  • bandwidth more plentiful

sparse:

  • # networks with group members small wrt # interconnected networks
  • group members “widely dispersed”
  • bandwidth not plentiful

140 of 144

Consequences of sparse-dense dichotomy:

dense:

  • group membership by routers assumed until routers explicitly prune
  • data-driven construction on mcast tree (e.g., RPF)
  • bandwidth and non-group-router processing profligate

sparse:

  • no membership until routers explicitly join
  • receiver-driven construction of mcast tree (e.g., center-based)
  • bandwidth and non-group-router processing conservative


141 of 144

PIM - dense mode

flood-and-prune RPF: similar to DVMRP but…

  • underlying unicast protocol provides RPF info for incoming datagram
  • less complicated (less efficient) downstream flood than DVMRP reduces reliance on underlying routing algorithm
  • has protocol mechanism for router to detect it is a leaf-node router

142 of 144

PIM - sparse mode

  • center-based approach
  • router sends join msg to rendezvous point (RP)
    • intermediate routers update state and forward join
  • after joining via RP, router can switch to source-specific tree
    • increased performance: less concentration, shorter paths

[Figure: routers R1 to R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]

143 of 144

PIM - sparse mode

sender(s):

  • unicast data to RP, which distributes down RP-rooted tree
  • RP can extend mcast tree upstream to source
  • RP can send stop msg if no attached receivers
    • “no one is listening!”

[Figure: routers R1 to R7; join messages travel toward the rendezvous point; all data multicast from rendezvous point]

144 of 144

4.1 introduction

4.2 virtual circuit and datagram networks

4.3 what’s inside a router

4.4 IP: Internet Protocol

    • datagram format, IPv4 addressing, ICMP, IPv6

4.5 routing algorithms

    • link state, distance vector, hierarchical routing

4.6 routing in the Internet

    • RIP, OSPF, BGP

4.7 broadcast and multicast routing


Chapter 4: done!

  • understand principles behind network layer services:
    • network layer service models, forwarding versus routing, how a router works, routing (path selection), broadcast, multicast
  • instantiation, implementation in the Internet