1 of 14

Upgrade Research - Ordering Issues

2 of 14

Introduction

  • The following is a sample of ordering issues found while researching this “upgrade” flow:
    • Shut down ODL, wipe the data store, restart ODL (but do NOT allow OVS to reconnect)
    • Wait for the networking-odl full sync from Neutron
    • Wipe OVS flows and groups, then reconnect OVS
  • Two scenarios were tested
    • Dual-node w/ auto-created vxlan tunnels
    • Single node w/ external network and floating IP (the examples in this presentation come from this scenario)

3 of 14

Single node w/ external net and floating IP

4 of 14

Default NAT Flow for External Network

This flow is not present on OVS after “upgrade”:

table=21, priority=10,ip,metadata=0x30d42/0xfffffe actions=group:225000

(This flow checks the VRF id and forwards to a group that outputs to the external interface.)

The flow is in fact written by ODL but rejected by OVS because group:225000 is not present at the time the flow is written.

Group 225000 is not written because, at the time its creation is triggered (twice, actually), there is no /elan-interfaces/elan-interface entry for the external network (trunk) port, since OVS has not yet connected.
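
The ordering dependency described above can be illustrated with a toy model. This is only a sketch: ToySwitch and GroupOrderingDemo are hypothetical names, not ODL or OVS code, and the model assumes nothing more than what the slide states, namely that a flow whose action points at a missing group is rejected.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical toy switch: rejects any flow that references an unknown group. */
class ToySwitch {
    private final Set<Long> groups = new HashSet<>();
    private final List<String> flows = new ArrayList<>();

    void addGroup(long groupId) {
        groups.add(groupId);
    }

    /** Returns true if the flow was installed, false if it was rejected. */
    boolean addFlow(String match, long groupId) {
        if (!groups.contains(groupId)) {
            return false; // group not present yet, so the flow is dropped
        }
        flows.add(match + " actions=group:" + groupId);
        return true;
    }

    List<String> flows() {
        return flows;
    }
}

class GroupOrderingDemo {
    public static void main(String[] args) {
        String match = "table=21,priority=10,ip,metadata=0x30d42/0xfffffe";

        // Upgrade ordering: the flow arrives before the group exists.
        ToySwitch upgraded = new ToySwitch();
        System.out.println("flow first:  installed=" + upgraded.addFlow(match, 225000L));

        // Regular ordering: the group is written first, then the flow.
        ToySwitch regular = new ToySwitch();
        regular.addGroup(225000L);
        System.out.println("group first: installed=" + regular.addFlow(match, 225000L));
    }
}
```

Note that writing the group later does not bring the rejected flow back; something has to write the flow again once the group exists.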

5 of 14

FloatingIpListener.createNatFlowEntries

This method depends on information from SouthBound that is not present before OVS connects.

Zero returned here due to lack of /interfaces-state/interface

6 of 14

FloatingIpListener.createNatFlowEntries II

Later in that method a call is made to VpnFloatingIpHandler.onAddFloatingIp where we find:

A few frames up-stack from here (NatEvpnUtil.addRoutesForVxLanProvType) the flow is aborted due to this null value.

nextHopIp is null because DPN-TEPs-info has not been created yet.

7 of 14

VpnManagerImpl.setupArpResponderFlowsToExternalNetworkIps

Null because /elan-dpn-interfaces/etc. does not exist yet. It is only created once OVS connects.

8 of 14

Issue with order of networking-odl sync

  • During full sync, networking-odl sends routers before ports (due to a weird dependency that runs backwards)
  • During upgrade this causes failures because the port object for the router's external interface is missing when the router is created

9 of 14

Dual-node w/ auto-created vxlan tunnels

10 of 14

VxLan tunnel flows not present after upgrade (genius)

  1. Regular application flow
    1. Multiple OVS instances connect, ODL decides to set up a transport zone
    2. Config interface objects are created
    3. Actual TEPs are attached to OVS, which triggers…
    4. /interfaces-state/interface object created w/ ifindex
    5. Flows created
  2. Upgrade flow
    • Single OVS connects…
    • ...TEP triggers genius…
    • /interfaces-state/interface object created but no ifindex due to lack of a config interface
    • Flows not created
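
The difference between the two flows above can be sketched as a toy model. ToyInterfaceManager and its methods are hypothetical names, not genius code; the only behavior assumed is what the slide states: the state interface gets an ifindex only if a config interface already exists, and flows are created only when there is an ifindex.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.OptionalInt;
import java.util.Set;

/** Hypothetical stand-in for the interface handling described on this slide. */
class ToyInterfaceManager {
    private final Set<String> configInterfaces = new HashSet<>();
    private final Map<String, OptionalInt> stateInterfaces = new HashMap<>();
    private int nextIfIndex = 1;

    void createConfigInterface(String name) {
        configInterfaces.add(name);
    }

    /** Called when a TEP shows up on OVS. Returns true if flows would be created. */
    boolean onTepAttached(String name) {
        // The state object is always created, but the ifindex is only
        // allocated when a matching config interface is already present.
        OptionalInt ifIndex = configInterfaces.contains(name)
                ? OptionalInt.of(nextIfIndex++)
                : OptionalInt.empty();
        stateInterfaces.put(name, ifIndex);
        return ifIndex.isPresent();
    }

    OptionalInt ifIndexOf(String name) {
        return stateInterfaces.getOrDefault(name, OptionalInt.empty());
    }
}

class TunnelOrderingDemo {
    public static void main(String[] args) {
        // Regular flow: config interface first, then the TEP attaches.
        ToyInterfaceManager regular = new ToyInterfaceManager();
        regular.createConfigInterface("tun-a");
        System.out.println("regular: flows=" + regular.onTepAttached("tun-a"));

        // Upgrade flow: the TEP attaches before any config interface exists.
        ToyInterfaceManager upgrade = new ToyInterfaceManager();
        System.out.println("upgrade: flows=" + upgrade.onTepAttached("tun-a"));
    }
}
```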

11 of 14

Proposed Infrastructure for fixes

12 of 14

Global “upgrading” flag

  • Fixing the above issues requires checking for orderings that do not generally occur in the regular application flow
  • It would be better not to burden the regular application flow with these checks where they are unnecessary
  • A global “upgrading” flag would turn these checks on and off
  • Set and unset via REST by upgrade scripts/tools
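
A minimal sketch of how such a flag could gate the extra checks. UpgradeState and NatFlowWriter are hypothetical names, not existing ODL classes. The setter stands in for whatever the REST call would eventually invoke, and the "group present" check is an injected placeholder.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.BooleanSupplier;

/** Hypothetical holder for the global "upgrading" flag. */
class UpgradeState {
    private final AtomicBoolean upgrading = new AtomicBoolean(false);

    // Assumption: upgrade scripts/tools would reach this through REST.
    void setUpgrading(boolean value) {
        upgrading.set(value);
    }

    boolean isUpgrading() {
        return upgrading.get();
    }
}

/** Hypothetical writer that only pays for the ordering check during upgrade. */
class NatFlowWriter {
    private final UpgradeState state;
    private final BooleanSupplier groupPresent;

    NatFlowWriter(UpgradeState state, BooleanSupplier groupPresent) {
        this.state = state;
        this.groupPresent = groupPresent;
    }

    String writeDefaultNatFlow() {
        // Regular flow: the flag is off, so the check is skipped entirely.
        if (state.isUpgrading() && !groupPresent.getAsBoolean()) {
            return "DEFERRED";
        }
        return "WRITTEN";
    }
}

class UpgradeFlagDemo {
    public static void main(String[] args) {
        UpgradeState state = new UpgradeState();
        NatFlowWriter writer = new NatFlowWriter(state, () -> false);

        System.out.println("flag off: " + writer.writeDefaultNatFlow());
        state.setUpgrading(true);
        System.out.println("flag on:  " + writer.writeDefaultNatFlow());
    }
}
```

With the flag off the writer behaves exactly as it does today, which is the point of keeping the checks out of the regular application flow.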

13 of 14

Generic “wait for X and execute this lambda”

  • Waits for a specific object to appear in MD-SAL, calls a function object, then disables itself
  • The following is from a rough patch of mine for one of the issues above.
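
As a rough illustration of the idea only (this is not the patch mentioned above, and it does not use the real MD-SAL API), the one-shot "wait for X and execute this lambda" pattern could look like the sketch below. ToyDataStore and waitFor are hypothetical names, and the store is a plain in-memory map that is not thread-safe.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

/** Hypothetical in-memory stand-in for a data store with one-shot waiters. */
class ToyDataStore {
    private final Map<String, Object> data = new HashMap<>();
    private final Map<String, List<Consumer<Object>>> waiters = new HashMap<>();

    /** Runs the callback once, as soon as an object exists at the given path. */
    void waitFor(String path, Consumer<Object> callback) {
        Object existing = data.get(path);
        if (existing != null) {
            callback.accept(existing); // already there, no need to wait
            return;
        }
        waiters.computeIfAbsent(path, k -> new ArrayList<>()).add(callback);
    }

    /** Stores a non-null value and fires any waiters registered for the path. */
    void put(String path, Object value) {
        data.put(path, value);
        // Removing the waiters before calling them is what makes each one
        // "disable itself": later writes to the same path find no waiters.
        List<Consumer<Object>> pending = waiters.remove(path);
        if (pending != null) {
            for (Consumer<Object> callback : pending) {
                callback.accept(value);
            }
        }
    }
}

class WaitForDemo {
    public static void main(String[] args) {
        ToyDataStore store = new ToyDataStore();
        String path = "/elan-interfaces/elan-interface";

        // Registered before OVS connects, so nothing happens yet.
        store.waitFor(path, value -> System.out.println("writing group for " + value));

        store.put(path, "external-port"); // callback runs here
        store.put(path, "external-port"); // callback does not run again
    }
}
```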

14 of 14

Changes - also see COMMENTS on previous slides...