1 of 15

SDN Theory & Best Practices

Nick Buraglio

Network Engineer, ESnet

Lawrence Berkeley National Laboratory

GPN Lecture Series

04/08/2015

2 of 15

Background

  • 18 years designing, installing, and supporting large production HPC, campus, and service provider networks
  • SDN involvement since 2009
  • Primarily focused on OpenFlow as a control protocol
  • Other SDN platforms opportunistically
    • In-house built
    • SNMP
    • Ansible
    • etc…
  • Adjunct security engineer

3 of 15

Transitioning from research to production

  • ...or at least out of the research lab
    • The most difficult step
      • Will uncover unanticipated issues
      • Will aid in assessing supportability
      • Will assist in building operational model
      • Will not prove total viability (only time can do that)

4 of 15

“It works in mininet”

  • The most common statement / assumption is “It works in mininet”
    • Mininet is a software emulator, good for an initial proof of concept
    • Extensive testing of x, y, and z must be done on actual hardware before a design can be considered valid
    • “Break it before you trust it”

5 of 15

All hardware is not created equal

  • Proprietary SDN solutions are generally locked into a specific ASIC or into custom (read: vendor-locked) software on merchant chips
  • Support for OpenFlow 1.3 varies widely
    • Different vendors have wildly different implementations of the optional features
    • Different platforms also have varying support for required features
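
One way to make this variance concrete is to track, per platform, which features a design depends on and which ones lab testing has confirmed. The sketch below is a minimal illustration: the feature names, platform names, and support sets are hypothetical placeholders, not real vendor data.

```python
# Hypothetical feature names a given design depends on (not an official list).
REQUIRED = {"multiple_tables", "group_select", "meters", "set_field_vlan"}

# Hypothetical per-platform results, e.g. gathered from lab testing on real hardware.
PLATFORMS = {
    "vendor_a_switch": {"multiple_tables", "group_select", "set_field_vlan"},
    "vendor_b_switch": {"multiple_tables", "group_select", "meters", "set_field_vlan"},
}


def missing_features(required, platforms):
    """Return {platform: sorted list of required features it lacks}."""
    return {name: sorted(required - supported) for name, supported in platforms.items()}


gaps = missing_features(REQUIRED, PLATFORMS)
for name, missing in sorted(gaps.items()):
    status = "OK" if not missing else "missing: " + ", ".join(missing)
    print(f"{name}: {status}")
```

A table like this is only as good as the hardware testing behind it; datasheet claims would still need to be verified on the actual platform.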

6 of 15

Controllers: One ring to rule them all(?)

  • No assumptions should be made about controllers
    • Most significant attack vector
    • Prone to both deliberate DoS and architectural failures
    • “A host in a network device world”
    • Redundancy is non-trivial

7 of 15

Controllers

  • Treat a controller like your most important asset
    • Lock it down
    • Hide it
    • Monitor it like it is an infant
    • Alert on anomalies
    • Keep patched
    • Have more than one
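
As a rough sketch of "monitor it" and "have more than one", the check below probes each controller's OpenFlow listener over TCP and raises an alert when one is unreachable or when redundancy is lost. The addresses are documentation placeholders and the function names are hypothetical. Port 6653 is the IANA-assigned OpenFlow port, but a given deployment may use another.

```python
import socket


def tcp_probe(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def controller_alerts(controllers, probe=tcp_probe, min_healthy=2):
    """Return alert strings for unreachable controllers and for lost redundancy."""
    alerts = []
    healthy = 0
    for host, port in controllers:
        if probe(host, port):
            healthy += 1
        else:
            alerts.append(f"controller {host}:{port} unreachable")
    if healthy < min_healthy:
        alerts.append(f"only {healthy} healthy controller(s), need {min_healthy}")
    return alerts


# Placeholder addresses (RFC 5737 documentation range), assumed for illustration.
CONTROLLERS = [("192.0.2.10", 6653), ("192.0.2.11", 6653)]
```

A TCP connect only shows that the port answers. It says nothing about whether the controller is behaving correctly, so a real deployment would pair it with application-level checks.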

8 of 15

Failure modes

  • Management not out of band
  • Fail closed
  • Untested scenarios (i.e. break it before you buy it)
  • Just because it’s “SDN” does not mean it should not conform to network architecture best practices and standards

9 of 15

OpenFlow is a very new protocol

  • Remember early implementations of IPv6, MPLS, IPv4 Multicast, or even mBGP? Your seasoned network engineers do.
  • As protocols mature, they become more stable. Early implementations of every protocol have been buggy; SDN (and specifically OpenFlow) is no different.

10 of 15

OpenFlow is a very new protocol

  • This also means that these new technologies are not battle tested
  • Security problems we do know about
    • Compromised controller
    • Malicious flowmods
    • Easily snooped control plane (comparatively speaking)
  • Security problems we do not know about
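
One possible mitigation for malicious flowmods is to check each requested flow modification against a local policy before it reaches a switch. The sketch below assumes a simplified dictionary representation of a flowmod. The field names, port set, and priority ceiling are all hypothetical and do not correspond to any particular controller's API.

```python
# Hypothetical policy values, assumed for illustration only.
ALLOWED_OUTPUT_PORTS = {1, 2, 3}
MAX_PRIORITY = 1000


def flowmod_violations(flowmod):
    """Return a list of policy violations for a flowmod given as a plain dict."""
    problems = []
    if not flowmod.get("match"):
        problems.append("match is empty (would apply to all traffic)")
    if flowmod.get("priority", 0) > MAX_PRIORITY:
        problems.append("priority exceeds policy maximum")
    for port in flowmod.get("output_ports", []):
        if port not in ALLOWED_OUTPUT_PORTS:
            problems.append(f"output to unapproved port {port}")
    return problems


ok = {"match": {"ipv4_dst": "198.51.100.0/24"}, "priority": 100, "output_ports": [2]}
bad = {"match": {}, "priority": 65535, "output_ports": [9]}
print(flowmod_violations(ok))
print(flowmod_violations(bad))
```

A filter like this only covers the problems we already know about. It does nothing for a compromised controller that sits behind the check.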

11 of 15

The developer isn’t replacing the engineer

  • ...Short term
  • Long term we’ll have something totally new (or a return to the old ways, depending on length of memory): a network engineer that is also a developer.
  • You cannot do production SDN today without two critical things:
    • A solid developer that understands real networks
    • A network engineer with great fundamental skills and fantastic troubleshooting methodology
      • Bonus if they also understand code

12 of 15

Controller is not necessarily required

  • SDN does not require a controller
    • ...although the two terms are often linked
    • Puppet
    • Ansible
    • Custom SNMP

13 of 15

What we desperately need

  • More production deployments pushing limits
  • More work, fresh ideas, and proof-of-concept on interdomain SDN
  • More attention to production architectures
  • More sharing of what does work in these scenarios
  • More sharing of what does not work in these scenarios
  • More sharing of everything
  • More collaboration between network researchers and experienced engineers, operators, and architects

14 of 15

SDN: It’s really about...

  • Long term: The mechanism is less important than the result
    • Remove human error
    • Automate repetitive tasks
    • Ease overall management of complicated things
  • Automation, Simplification, Verification (Securification?)
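
The "verification" part can be as simple as comparing intended state with what the network actually reports, whatever mechanism pushed the configuration. The sketch below assumes both states are available as flat dictionaries. The interface and VLAN names are made up for illustration.

```python
def state_drift(intended, observed):
    """Return {key: (intended_value, observed_value)} for every key that differs.

    A key missing on one side is reported with None for that side.
    """
    drift = {}
    for key in intended.keys() | observed.keys():
        want, have = intended.get(key), observed.get(key)
        if want != have:
            drift[key] = (want, have)
    return drift


# Hypothetical example data: interface name -> assigned VLAN.
intended = {"eth1": "vlan100", "eth2": "vlan200"}
observed = {"eth1": "vlan100", "eth2": "vlan300", "eth3": "vlan1"}

for key, (want, have) in sorted(state_drift(intended, observed).items()):
    print(f"{key}: intended={want} observed={have}")
```

Running a comparison like this after every automated change is one way to catch both human error and automation bugs before users do.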

15 of 15

SDN: Theory vs. Practice

Nick Buraglio

buraglio@es.net

Network Engineer, ESnet

Lawrence Berkeley National Laboratory

CODASPY 2016

03/10/2015