1 of 30

Demo:

MidoNet VPC Peering

a.k.a. “router peering”

September 2015

2 of 30

Agenda

  1. What is MidoNet
  2. What is VPC Peering (cross-WAN virtual router peering)
  3. Augmenting MN Router with VXLAN capability
  4. Router peering with VTEP-Router
  5. What is Direct Connect
  6. Router Peering orchestration API/workflow
  7. Demo (peer routers across 2 all-in-one nodes)

Demo based on:

https://github.com/gdecandia/midonet-1/tree/peering2

3 of 30

MidoNet transforms this...

[Diagram: Bare Metal Servers and a large number of VMs, all attached to an IP Fabric.]

4 of 30

into this...

[Diagram: Bare Metal Servers and VMs, now with FW and LB elements and a connection to the Internet/WAN.]

5 of 30

What is VPC Peering?

  • The AWS term for a link from router A in Zone 1 to router B in Zone 2 of the same Region.
  • Allows Tenants to connect their resources in different Zones using private addresses.
  • Traffic stays within the cloud provider’s WAN. The tenant usually does not encrypt the traffic.

6 of 30

What is VPC Peering?

[Diagram: two sites (Site 1, Site 2), each with VMs. Labels: 10.0.0.0/24, 10.0.1.0/24, Floating IP Range1, Floating IP Range2, Internet, Private WAN.]

Peering links carry private address traffic

7 of 30

MN Virtual Router augmented for VTEP

P1 Remote MAC Table (VNI = 123)

  • MAC1 -> tun IP1
  • MAC2 -> tun IP2
  • MAC3 -> tun IP3
  • … and so on...

P2 Remote MAC Table (VNI = 99)

  • MAC4 -> tun IP2
  • MAC5 -> tun IP1
  • MAC6 -> tun IP2
  • … and so on...

Routing table

  • From 0.0.0.0/0 to 0.0.0.0/0 via <Gateway>, P3
  • … can have other routes… it’s a full router.

Incoming VXLAN traffic is decap’ed + emitted from the appropriate L2 port (based on VNI).

Traffic that ingresses an L2 port is VXLAN encap’ed with src=<P3’s IP address> and dst=<tunnel for remote MAC>, then sent to the routing table.

[Diagram labels: router ports P1, P2, P3]
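The two rules on this slide can be sketched in a few lines of Python. This is only an illustrative model, not MidoNet code: the class names `L2Port` and `VtepRouter`, the field names, and the placeholder address `"P3-IP"` are all assumptions, and the handling of an unknown MAC is left unmodeled.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class L2Port:
    """Hypothetical model of a router L2 port: one VNI plus a remote MAC table."""
    name: str
    vni: int
    remote_macs: Dict[str, str] = field(default_factory=dict)  # MAC -> tunnel IP


@dataclass
class VtepRouter:
    """Hypothetical model of the augmented router; P3's address is the VXLAN source."""
    p3_ip: str
    ports_by_vni: Dict[int, L2Port] = field(default_factory=dict)

    def add_l2_port(self, port: L2Port) -> None:
        self.ports_by_vni[port.vni] = port

    def encap(self, port: L2Port, dst_mac: str) -> Optional[Tuple[str, str, int]]:
        """Frame ingresses an L2 port: return (outer src, outer dst, VNI), or None."""
        tunnel_ip = port.remote_macs.get(dst_mac)
        if tunnel_ip is None:
            return None  # unknown MAC; what the real router does here is not modeled
        return (self.p3_ip, tunnel_ip, port.vni)

    def decap(self, vni: int) -> Optional[str]:
        """VXLAN packet arrives: return the name of the L2 port selected by VNI."""
        port = self.ports_by_vni.get(vni)
        return port.name if port is not None else None


router = VtepRouter(p3_ip="P3-IP")
p1 = L2Port("P1", 123, {"MAC1": "tun IP1", "MAC2": "tun IP2", "MAC3": "tun IP3"})
p2 = L2Port("P2", 99, {"MAC4": "tun IP2", "MAC5": "tun IP1", "MAC6": "tun IP2"})
router.add_l2_port(p1)
router.add_l2_port(p2)

print(router.encap(p1, "MAC2"))  # ('P3-IP', 'tun IP2', 123)
print(router.decap(99))          # P2
```

After `encap`, the resulting outer packet would be handed to the router's ordinary routing table, which is why the default route via P3 matters.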

8 of 30

Cross-site L2 using overlay VTEP

[Diagram: an L2 Frame is wrapped in VXLAN and then UDP+IP+Ethernet, carried across the WAN or DC Network, and unwrapped back to the original L2 Frame on the far side.]
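As a rough illustration of the VXLAN layer in this stack, the sketch below packs and unpacks the 8-byte VXLAN header as laid out in RFC 7348 (an 8-bit flags field with the "I" bit 0x08, 24 reserved bits, a 24-bit VNI, 8 reserved bits). The function names are hypothetical, and the outer UDP+IP+Ethernet headers are deliberately left out.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte VXLAN header for the given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, then 24 reserved bits.
    # Word 2: VNI in the top 24 bits, then 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Read the VNI back out of a VXLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag not set")
    return vni_word >> 8


def encapsulate(l2_frame: bytes, vni: int) -> bytes:
    """Prepend only the VXLAN header; outer UDP+IP+Ethernet are omitted here."""
    return pack_vxlan_header(vni) + l2_frame


header = pack_vxlan_header(10001)
print(len(header), hex(header[0]), unpack_vni(header))  # 8 0x8 10001
```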

9 of 30

Cross-site router peering in MN

[Diagram: the two-site peering topology (Site 1 and Site 2, with labels 10.0.0.0/24, 10.0.1.0/24, Floating IP Range1, Floating IP Range2, Internet, Private WAN), now with a VTEP-capable virtual router at each site and vxlan packets crossing the Private WAN.]

10 of 30

Allows >2 routers on VXLAN segment

[Diagram: six groups of two VMs; the subnet labels 10.0.0.0/24, 10.0.1.0/24 and 10.0.2.0/24 each appear twice.]

11 of 30

Allows >2 routers on a VXLAN segment

[Diagram: three sites (Site 1, Site 2, Site 3) with VMs on 10.0.0.0/24, 10.0.1.0/24 and 10.0.2.0/24. Each group is labeled Floating IP Range1, and all three exchange vxlan packets over one Private WAN.]

12 of 30

N routers can all be connected with 1 L2 segment instead of O(N^2) p2p links in AWS VPC peering
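A quick worked count, as a sketch: if every pair of routers needs its own point-to-point link, N routers need N(N-1)/2 links, while the shared-segment approach needs one segment regardless of N. The function names are illustrative only.

```python
def full_mesh_links(n: int) -> int:
    """Point-to-point links needed so every pair of n routers is directly peered."""
    return n * (n - 1) // 2


def shared_segments(n: int) -> int:
    """L2 segments needed when all n routers attach to one VXLAN segment."""
    return 1 if n >= 2 else 0


for n in (2, 3, 10, 50):
    print(n, full_mesh_links(n), shared_segments(n))
# 2 1 1
# 3 3 1
# 10 45 1
# 50 1225 1
```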

13 of 30

What is Direct Connect?

  • The AWS term for a private connection between a VPC and the customer’s physical router in a Peering Facility.
  • Azure’s equivalent feature is ExpressRoute.
  • AWS Direct Connect also generates BGP configuration for Juniper/Cisco (others?) routers to make the customer-side setup easier.
  • Direct Connect only allows one VLAN on the physical router to connect to one VPC.

14 of 30

Direct Connect with MN and HW VTEP

[Diagram: Site 1 (VMs on 10.0.0.0/24, Floating IP Range1) with a VTEP-capable virtual router exchanging vxlan packets over the Private WAN with a hardware VTEP in the Peering Facility, next to the customer physical router.]

  • Hardware VTEP running an OVSDB server. MN’s OVSDB client manages it.
  • L2 ports are VLAN trunks and can carry up to 4K distinct VLANs. Each port/VLAN pair will be mapped to one project/tenant cross-site segment.

15 of 30

MN VTEP instead of HW VTEP

[Diagram: Site 1 (VMs on 10.0.0.0/24, Floating IP Range1) exchanging vxlan packets over the Private WAN with the Peering Facility. Labels in the Peering Facility: New VTEP-capable virtual router, VLAN-aware bridge, untagged vlan 10 port, untagged vlan 20 port, trunk port.]

By binding the trunk port to a physical NIC, we can build DirectConnect with x86 hardware. Use 1 VLAN-aware bridge per available physical NIC.

16 of 30

MN VTEP instead of HW VTEP

[Diagram: Site 1 (VMs on 10.0.0.0/24, Floating IP Range1) exchanging vxlan packets over the Private WAN with the Peering Facility. Labels in the Peering Facility: New VTEP-capable virtual router and two trunk ports.]

VLAN-aware bridges each support 4k untagged vlan ports.

By binding trunk ports to physical NICs, we can build DirectConnect with x86 hardware.

17 of 30

Connecting to local VRF router

[Diagram: VMs on 10.0.0.0/24 with Floating IP Range1 on the virtual side; a VLAN-aware bridge with a trunk port leads to a VRF-capable physical router on the physical side.]

This was already possible in MN2015.01 (or v1.8+)

18 of 30

Connecting to remote VRF router

[Diagram: VMs on 10.0.0.0/24 with Floating IP Range1 exchanging vxlan packets over the Private WAN with a New VTEP-capable virtual router. Remaining labels: vlan 10, vlan 20, VLAN-aware bridge, trunk port, VRF-capable physical router, and a virtual/physical boundary.]

19 of 30

Implementation: Low-level Network Objects

RouterPort has been augmented with:

  • VNI integer
    • if non-zero, the port behaves as an L2 port in a VXLAN Logical Switch
    • all L2 ports on the router must have unique VNIs
  • MAC->VTEP map
    • should be populated by a control plane or orchestrator
  • (experimental optimization) “offRampVxlan” boolean
    • if True, any Hypervisor can send VXLAN traffic directly to VTEPs at other sites without first sending to the L3 Gateway.
  • (unused) default_remote_vtep IP address
    • if MAC is not found in the map, send the VXLAN traffic to this IP
  • (lazy/will-fix) localVtepIp
    • the source address of VXLAN packets sent by a VTEP router
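A minimal sketch of how these fields might hang together, assuming a Python stand-in for the real RouterPort (which is not shown in this deck). The class name `RouterPortVxlan`, the snake_case field names, and the helper `check_unique_vnis` are hypothetical. The fallback to `default_remote_vtep` follows the bullet above, even though the slide marks that field as unused.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, Optional


@dataclass
class RouterPortVxlan:
    """Hypothetical stand-in for the VXLAN-related RouterPort fields."""
    vni: int = 0                                   # non-zero => L2 port
    mac_to_vtep: Dict[str, str] = field(default_factory=dict)
    off_ramp_vxlan: bool = False                   # experimental optimization
    default_remote_vtep: Optional[str] = None      # marked unused on the slide
    local_vtep_ip: Optional[str] = None            # source address for VXLAN

    @property
    def is_l2_port(self) -> bool:
        return self.vni != 0

    def remote_vtep_for(self, mac: str) -> Optional[str]:
        """Look up the VTEP for a MAC, falling back to the default if set."""
        return self.mac_to_vtep.get(mac, self.default_remote_vtep)


def check_unique_vnis(ports: Iterable[RouterPortVxlan]) -> None:
    """Raise if two L2 ports on the same router share a VNI."""
    seen = set()
    for port in ports:
        if not port.is_l2_port:
            continue  # ordinary L3 ports are ignored
        if port.vni in seen:
            raise ValueError(f"duplicate VNI {port.vni}")
        seen.add(port.vni)


l2 = RouterPortVxlan(vni=10001, mac_to_vtep={"02:bb:aa:aa:dd:02": "200.200.200.8"})
l3 = RouterPortVxlan()
check_unique_vnis([l2, l3])
print(l2.is_l2_port, l3.is_l2_port)  # True False
```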

20 of 30

Tentative: High-level 1-site Orchestration Objects

RouterVtepBinding:

  • vtepId
  • vni
  • routerId
  • routerPortIP
  • routerPortMAC

RouterVtepRoute:

  • bindingId
  • remoteCIDR
  • gwIP (for CIDR)
  • gwMAC
  • remoteVtepIP

21 of 30

Tentative: Orchestrator-oriented Workflow (per-Site)

1-time VTEP setup (by admin):

  • Create a virtual router to act as a VTEP. It will send/receive VXLAN packets, and encap/decap them on behalf of other routers.

When a Tenant Router wants to peer with other sites:

  • Create RouterVtepBinding - this object links the tenant router to the (admin) VTEP router and specifies the VNI.

Because Tenant Router doesn’t have IPAM or Dynamic Routing, we need to explicitly set routes to other sites:

  • Create RouterVtepRoute - this object associates the Binding with a specific remote CIDR, remote Router IP (and MAC to avoid ARP), and remote VTEP IP (caller/orchestrator is acting as the VXLAN control plane).
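The per-site workflow could be driven by an orchestrator roughly as sketched below. The `Binding` and `Route` classes are hypothetical helpers, not part of MidoNet; they only render command strings in the keyword order used by the midonet-cli examples later in this deck (prototype branch syntax, which may change).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Binding:
    """Hypothetical helper: links a tenant router to the admin VTEP router."""
    vtep: str
    router: str
    vtep_ip: str
    router_subnet: str
    router_ip: str
    router_mac: str
    vni: int

    def to_cli(self) -> str:
        return (
            f"router-vtep-binding add vtep {self.vtep} router {self.router} "
            f"vtep-ip {self.vtep_ip} router-subnet {self.router_subnet} "
            f"router-ip {self.router_ip} router-mac {self.router_mac} "
            f"vni {self.vni}"
        )


@dataclass(frozen=True)
class Route:
    """Hypothetical helper: one remote CIDR reachable through a binding."""
    binding: str
    cidr: str
    gw_ip: str
    gw_mac: str
    remote_vtep_ip: str

    def to_cli(self) -> str:
        return (
            f"router-vtep-route add binding {self.binding} cidr {self.cidr} "
            f"gw-ip {self.gw_ip} gw-mac {self.gw_mac} "
            f"remote-vtep-ip {self.remote_vtep_ip}"
        )


binding = Binding("VtepRouter", "TenantRouter", "200.200.200.3",
                  "192.168.123.0/29", "192.168.123.1", "02:bb:aa:aa:dd:01", 10001)
route = Route("router-vtep-binding0", "10.2.0.0/16", "192.168.123.2",
              "02:bb:aa:aa:dd:02", "200.200.200.8")
print(binding.to_cli())
print(route.to_cli())
```

Because the tenant router has no dynamic routing, one `Route` is needed per remote CIDR; the caller supplies the gateway MAC and remote VTEP IP, acting as the VXLAN control plane.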

22 of 30

Create RouterVtepBinding

router-vtep-binding add vtep VtepRouter router TenantRouter vtep-ip 200.200.200.3 router-subnet 192.168.123.0/29 router-ip 192.168.123.1 router-mac 02:bb:aa:aa:dd:01 vni 10001

[Diagram labels: Vtep Router, Tenant Router, ports P1, P2, P3]

P2 Logical Switch

  • VNI = 10001
  • LocalVtep = IP of P1 (200.200.200.3)

Vtep-Router’s Forwarding Table (WAN is 200.200.200.0/24)

  • Dst 200.200.200.0/24 via PORT P1 (on-link)
  • Dst 200.200.200.3/32 LOCAL

Tenant Router’s Forwarding Table

  • Dst 192.168.123.0/29 via PORT P3 (on-link)
  • Dst 192.168.123.1/32 LOCAL

23 of 30

Create RouterVtepRoute

router-vtep-route add binding router-vtep-binding0 cidr 10.2.0.0/16 gw-ip 192.168.123.2 gw-mac 02:bb:aa:aa:dd:02 remote-vtep-ip 200.200.200.8

[Diagram labels: Vtep Router, Tenant Router, ports P1, P2, P3]

P1 Logical Switch

  • VNI = 10001
  • LocalVtep = IP of P1 (200.200.200.3)
  • 02:bb:aa:aa:dd:02 => 200.200.200.8

Vtep-Router’s Forwarding Table (WAN is 200.200.200.0/24)

  • Dst 200.200.200.0/24 via PORT P1 (on-link)
  • Dst 200.200.200.3/32 LOCAL

Tenant Router’s Forwarding Table

  • Dst 192.168.123.0/29 via PORT P3 (on-link)
  • Dst 192.168.123.1/32 LOCAL
  • Dst 10.2.0.0/16 via 192.168.123.2

Tenant Router’s Neighbor Table

  • 192.168.123.2 => 02:bb:aa:aa:dd:02

24 of 30

Demo setup

[Diagram: Site 1 (All-in-one Host 1) and Site 2 (All-in-one Host 2), each with VMs, Tenant Routers without Gateway, and a VTEP-capable virtual router. Both hosts sit on a Neutron Network in the “under cloud” (MidoCloud). Other labels: 10.0.0.0/24, 10.0.1.0/24, Floating IP Range1, Floating IP Range2, 200.200.200.3, 200.100.100.3, 192.168.192.0/24.]

25 of 30

Tcpdump output

10:45:28.807563 fa:16:3e:4e:0c:4f (oui Unknown) > fa:16:3e:9c:26:9e (oui Unknown), ethertype IPv4 (0x0800), length 148: 200-100-100-3.dial-up.telesp.net.br.medimageportal > 200.200.200.3.4789: VXLAN, flags [I] (0x08), vni 10001

02:bb:aa:aa:dd:03 (oui Unknown) > 02:bb:aa:aa:dd:01 (oui Unknown), ethertype IPv4 (0x0800), length 98: 10.0.0.3 > 10.1.0.3: ICMP echo request, id 25345, seq 1, length 64

10:45:28.815983 fa:16:3e:9c:26:9e (oui Unknown) > fa:16:3e:4e:0c:4f (oui Unknown), ethertype IPv4 (0x0800), length 148: 200.200.200.3.28351 > 200-100-100-3.dial-up.telesp.net.br.4789: VXLAN, flags [I] (0x08), vni 10001

02:bb:aa:aa:dd:01 (oui Unknown) > 02:bb:aa:aa:dd:03 (oui Unknown), ethertype IPv4 (0x0800), length 98: 10.1.0.3 > 10.0.0.3: ICMP echo reply, id 25345, seq 1, length 64

Host 1 MAC: fa:16:3e:4e:0c:4f

Host 2 MAC: fa:16:3e:9c:26:9e

VTEP1: 200.200.200.3

VTEP2: 200.100.100.3

Router1 MAC: 02:bb:aa:aa:dd:01

Router2 MAC: 02:bb:aa:aa:dd:03

VM1 IP: 10.1.0.3

VM2 IP: 10.0.0.3
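The two frame lengths in the capture (148 outer, 98 inner) line up with the usual VXLAN overhead. A small worked check, assuming an outer IPv4 header without options and no outer VLAN tag:

```python
# Header sizes in bytes; the IPv4 figure assumes no IP options.
OUTER_HEADERS = {
    "Ethernet": 14,
    "IPv4": 20,
    "UDP": 8,
    "VXLAN": 8,
}


def vxlan_overhead() -> int:
    """Bytes added in front of the inner L2 frame by VXLAN encapsulation."""
    return sum(OUTER_HEADERS.values())


def encapsulated_length(inner_frame_len: int) -> int:
    """Length of the outer frame that carries an inner frame of the given size."""
    return inner_frame_len + vxlan_overhead()


# Inner frame from the capture: Ethernet (14) + IPv4 (20) + ICMP (64) = 98 bytes.
inner_len = 14 + 20 + 64
print(vxlan_overhead())                # 50
print(encapsulated_length(inner_len))  # 148
```

The same 50 bytes need to fit within the WAN's MTU, which is worth checking before sending full-size tenant frames across the peering link.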

26 of 30

Next time: add Direct Connect to this Demo!

[Diagram: the demo setup (Site 1 on All-in-one Host 1, Site 2 on All-in-one Host 2, each with VMs, Tenant Routers without Gateway, and a VTEP-capable virtual router, on a Neutron Network in the “under cloud” (MidoCloud)), extended with a Peering Facility that contains an (Emulated) Hardware VTEP running OVSDB server controlled by MidoNet and Customer physical routers. Other labels: 10.0.0.0/24, 10.0.1.0/24, Floating IP Range1, Floating IP Range2.]

27 of 30

Site 1 midonet-cli commands

> router list

router router0 name Router1 state up infilter chain2 outfilter chain3 asn -1

router router1 name A state up infilter chain4 outfilter chain5 asn -1

router router2 name B state up infilter chain6 outfilter chain7 asn -1

A1: 10.1.0.0/24

> router-vtep-binding add vtep router0 router router1 vtep-ip 200.200.200.3 router-subnet 192.168.123.0/29 router-ip 192.168.123.1 router-mac 02:bb:aa:aa:dd:01 vni 10001

> router-vtep-route add binding router-vtep-binding0 cidr 10.0.0.0/16 gw-ip 192.168.123.2 gw-mac 02:bb:aa:aa:dd:02 remote-vtep-ip 200.100.100.3

B1: 10.1.0.0/24

> router-vtep-binding add vtep router0 router router2 vtep-ip 200.200.200.3 router-subnet 192.168.123.0/29 router-ip 192.168.123.1 router-mac 02:bb:aa:aa:dd:01 vni 20002

> router-vtep-route add binding router-vtep-binding1 cidr 10.0.0.0/16 gw-ip 192.168.123.2 gw-mac 02:bb:aa:aa:dd:02 remote-vtep-ip 200.100.100.3

28 of 30

Site 2 midonet-cli commands

> router list

router router0 name Router1 state up infilter chain0 outfilter chain1 asn -1

router router1 name A state up infilter chain2 outfilter chain3 asn -1

router router2 name B state up infilter chain4 outfilter chain5 asn -1

A2: 10.0.0.0/24

> router-vtep-binding add vtep router0 router router1 vtep-ip 200.100.100.3 router-subnet 192.168.123.0/29 router-ip 192.168.123.2 router-mac 02:bb:aa:aa:dd:02 vni 10001

> router-vtep-route add binding router-vtep-binding0 cidr 10.1.0.0/16 gw-ip 192.168.123.1 gw-mac 02:bb:aa:aa:dd:01 remote-vtep-ip 200.200.200.3

B2: 10.0.0.0/24

> router-vtep-binding add vtep router0 router router2 vtep-ip 200.100.100.3 router-subnet 192.168.123.0/29 router-ip 192.168.123.2 router-mac 02:bb:aa:aa:dd:02 vni 20002

> router-vtep-route add binding router-vtep-binding1 cidr 10.1.0.0/16 gw-ip 192.168.123.1 gw-mac 02:bb:aa:aa:dd:01 remote-vtep-ip 200.200.200.3
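Because the operator acts as the control plane, the two sites' commands have to mirror each other. The sketch below checks that for tenant A using the commands from the two slides above; `parse_cli` and `peering_mismatches` are hypothetical helpers, and the parser assumes the simple "keyword value" layout seen in these examples.

```python
from typing import Dict, List


def parse_cli(command: str) -> Dict[str, str]:
    """Turn '<object> add key value key value ...' into a {key: value} dict."""
    args = command.split()[2:]  # skip the object name and the 'add' verb
    return dict(zip(args[0::2], args[1::2]))


def peering_mismatches(local_binding: Dict[str, str], local_route: Dict[str, str],
                       remote_binding: Dict[str, str]) -> List[str]:
    """Compare one site's binding and route against the other site's binding."""
    problems = []
    if local_binding["vni"] != remote_binding["vni"]:
        problems.append("vni differs between sites")
    if local_route["remote-vtep-ip"] != remote_binding["vtep-ip"]:
        problems.append("remote-vtep-ip is not the peer's vtep-ip")
    if local_route["gw-ip"] != remote_binding["router-ip"]:
        problems.append("gw-ip is not the peer's router-ip")
    if local_route["gw-mac"] != remote_binding["router-mac"]:
        problems.append("gw-mac is not the peer's router-mac")
    return problems


site1_binding = parse_cli(
    "router-vtep-binding add vtep router0 router router1 vtep-ip 200.200.200.3 "
    "router-subnet 192.168.123.0/29 router-ip 192.168.123.1 "
    "router-mac 02:bb:aa:aa:dd:01 vni 10001")
site1_route = parse_cli(
    "router-vtep-route add binding router-vtep-binding0 cidr 10.0.0.0/16 "
    "gw-ip 192.168.123.2 gw-mac 02:bb:aa:aa:dd:02 remote-vtep-ip 200.100.100.3")
site2_binding = parse_cli(
    "router-vtep-binding add vtep router0 router router1 vtep-ip 200.100.100.3 "
    "router-subnet 192.168.123.0/29 router-ip 192.168.123.2 "
    "router-mac 02:bb:aa:aa:dd:02 vni 10001")
site2_route = parse_cli(
    "router-vtep-route add binding router-vtep-binding0 cidr 10.1.0.0/16 "
    "gw-ip 192.168.123.1 gw-mac 02:bb:aa:aa:dd:01 remote-vtep-ip 200.200.200.3")

print(peering_mismatches(site1_binding, site1_route, site2_binding))  # []
print(peering_mismatches(site2_binding, site2_route, site1_binding))  # []
```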

29 of 30

midonet-cli commands - cleaning up

> router-vtep-route router-vtep-route0 delete

> router-vtep-route router-vtep-route1 delete

> router-vtep-binding router-vtep-binding0 delete

> router-vtep-binding router-vtep-binding1 delete

> router-vtep-route list

> router-vtep-binding list

30 of 30

Tunneling Choices

Can the Hypervisors send traffic directly to remote VTEPs?

  • Yes - then VXLAN tunnel directly from hypervisor to remote VTEP. Also send a copy of Flow State to the L3 Gateways, needed by the return flow.
  • No - then need to send the VXLAN traffic to the L3 Gateways first.
    • In prototype: double-encapsulation.
    • Future: single-encapsulation - modify the VXLAN header at the L3 Gateway.