1 of 47

Please visit the CNF Conformance deck for updated information about this program


2 of 47

CNF Testbed

Dan Kohn

Executive Director, CNCF

Arpit Joshipura

General Manager, LF Networking


3 of 47

TODAY THE LINUX FOUNDATION IS MUCH MORE THAN LINUX

Security: We are helping global privacy and security through a program to encrypt the entire internet.

Networking: We are creating ecosystems around networking to improve agility in the evolving software-defined datacenter.

Cloud: We are creating a portability layer for the cloud, driving de facto standards and developing the orchestration layer for all clouds.

Automotive: We are creating the platform for infotainment in the auto industry that can be expanded into instrument clusters and telematics systems.

Blockchain: We are creating a permanent, secure distributed ledger that makes it easier to create cost-efficient, decentralized business networks.

Web: We are providing the application development framework for next generation web, mobile, serverless, and IoT applications.

We are regularly adding projects; for the most up-to-date listing of all projects visit tlfprojects.org


4 of 47

Cloud Native Computing Foundation

  • Nonprofit, part of the Linux Foundation; founded Dec. 2015
  • Platinum members:

[Diagram: platinum member logos, plus CNCF graduated and incubating projects by category: orchestration, monitoring, service mesh, service discovery, storage, package management, distributed tracing and the distributed tracing API, messaging, software update spec, security, container security, networking API, network proxy, registry, key/value store, policy, container runtime, logging, remote procedure call, and serverless.]


5 of 47

[Diagram: LF Networking platinum carrier members, platinum vendor members, and projects.]

Vision: LFN software & projects provide platforms and building blocks for Network Infrastructure & Services across Service Providers, Cloud Providers, Enterprises, Vendors, and System Integrators, enabling rapid interoperability, deployment & adoption.


6 of 47

Evolving from VNFs to CNFs

[Diagram: Past: VNFs under the ONAP orchestrator, on OpenStack or VMware, running on bare metal or on Azure or Rackspace. Present: VNFs on OpenStack plus CNFs on Kubernetes, both under the ONAP orchestrator, on bare metal or any cloud. Future: CNFs and VNFs (via KubeVirt/Virtlet) on Kubernetes, under the ONAP orchestrator and OSS/BSS, on bare metal or any cloud.]


7 of 47

CNF Testbed

  • Open source initiative from CNCF
  • Compare performance of:
    • Virtual Network Functions (VNFs) on OpenStack, and
    • Cloud native Network Functions (CNFs) on Kubernetes
  • Identical networking code packaged as:
    • containers, or
    • virtual machines (VMs)
  • Running on top of identical on-demand hardware from the bare metal hosting company Packet

[Diagram: the same networking code (identical source, illustrated by matching #include snippets) packaged as a VNF in a virtual machine on OpenStack and as a CNF in a container on Kubernetes, each running on an identical bare-metal server.]


8 of 47

Multiple Service Function Chains: Test Cases

[Diagram: three test cases. OpenStack node, snake: chains of VNFs on a userspace-to-kernel dataplane (vSwitch), linked by vhost-user connections. Kubernetes node, pipeline: chains of CNFs on a userspace-to-userspace dataplane (vSwitch), linked by memif connections. Kubernetes node, snake: chains of CNFs on a userspace-to-userspace dataplane (vSwitch), linked by memif connections.]
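One way to read the snake and pipeline layouts above (this is our reading of the diagrams, not something the deck spells out): in a snake chain every hop between network functions passes back through the vSwitch, while in a pipeline chain the CNFs hand packets directly to the next CNF over memif and only touch the vSwitch at the ends of the chain. The toy Python sketch below encodes that reading by modeling NFs as plain functions and counting vSwitch crossings; it illustrates the topology difference only, not the real dataplane.

```python
# Toy model of the two chain topologies: NFs are plain functions and the
# "vSwitch" is just a crossing counter. This encodes one reading of the
# diagrams above and is not the real dataplane.

def make_nf(tag):
    return lambda packet: packet + [tag]

nfs = [make_nf(f"nf{i}") for i in range(1, 4)]

def snake(packet, nfs):
    crossings = 1                 # ingress through the vSwitch
    for nf in nfs:
        packet = nf(packet)
        crossings += 1            # back through the vSwitch after every NF
    return packet, crossings

def pipeline(packet, nfs):
    crossings = 1                 # ingress through the vSwitch
    for nf in nfs:
        packet = nf(packet)       # direct NF-to-NF handoff (e.g. memif)
    crossings += 1                # egress through the vSwitch
    return packet, crossings

print(snake([], nfs))      # (['nf1', 'nf2', 'nf3'], 4)
print(pipeline([], nfs))   # (['nf1', 'nf2', 'nf3'], 2)
```

Under this reading a 3-NF pipeline touches the vSwitch twice versus four times for the snake, which is consistent with the pipeline case reporting the higher packet rate in the results table.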


9 of 47

Summary of Results: Snake and Pipeline Case

[Chart: throughput of service chains, in millions of packets per second, for the 3-chain, 2-NF configuration (bigger is better).]


10 of 47

Testbed stats comparison

                     OpenStack                Kubernetes
Infra deploy time    ~65 minutes              16 minutes*
NF deploy time       3 minutes, 39 seconds    < 30 seconds
Idle state RAM       17.8%                    5.7%
Idle state CPU       7.2%                     0.1%
Runtime NF RAM       17.9%                    10.7%
Runtime NF CPU       28.8%                    39.1%
Snake case PPS       3.97 million PPS         4.93 million PPS
Snake case latency   ~2.1 milliseconds        ~2.1 milliseconds
Pipeline case PPS    N/A                      7.04 million PPS

* Will go down when we eliminate a currently-required reboot
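To put the table in relative terms, the short Python sketch below simply restates the published figures above and prints the ratios; nothing new is measured here.

```python
# Relative comparison of the figures reported in the table above.
# These are the published numbers; nothing is measured here.
openstack = {"infra_deploy_min": 65, "snake_mpps": 3.97}
kubernetes = {"infra_deploy_min": 16, "snake_mpps": 4.93, "pipeline_mpps": 7.04}

deploy_speedup = openstack["infra_deploy_min"] / kubernetes["infra_deploy_min"]
snake_gain = (kubernetes["snake_mpps"] / openstack["snake_mpps"] - 1) * 100
pipeline_vs_snake = (kubernetes["pipeline_mpps"] / kubernetes["snake_mpps"] - 1) * 100

print(f"Infra deploy: Kubernetes stands up ~{deploy_speedup:.1f}x faster")
print(f"Snake case: CNFs move ~{snake_gain:.0f}% more packets per second")
print(f"Pipeline vs. snake on Kubernetes: ~{pipeline_vs_snake:.0f}% more throughput")
```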


11 of 47

How Can You Engage?

  • Have your engineers replicate our results from github.com/cncf/cnf-testbed with an API key from packet.com/cnf (a quick key check is sketched after this list)
  • Create pull requests to improve Kubernetes or OpenStack deployments
  • Create pull requests to have the CNF Testbed run on your bare metal servers or other cloud bare metal servers like AWS i3.metal
  • Package your internal network functions into VNFs and CNFs and run on your instance of the testbed
    • We don’t need to see the code but would love to see the results
  • Help improve performance running CNFs on top of virtualized hardware
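For the first bullet, a quick pre-flight check of the Packet API key can save a failed deploy. The snippet below is a minimal sketch and not part of the cnf-testbed repo; it assumes the key is exported as PACKET_AUTH_TOKEN (the variable name is our choice) and lists the projects visible to that key via Packet's REST API.

```python
# Minimal sanity check of a Packet API key before running the CNF Testbed.
# Assumes the token is exported as PACKET_AUTH_TOKEN; this is not part of
# the cnf-testbed repo, just a quick pre-flight check.
import os
import requests

token = os.environ["PACKET_AUTH_TOKEN"]
resp = requests.get(
    "https://api.packet.net/projects",
    headers={"X-Auth-Token": token, "Accept": "application/json"},
    timeout=10,
)
resp.raise_for_status()
for project in resp.json().get("projects", []):
    print(project["id"], project["name"])
```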


12 of 47

Telecom User Group (TUG)

  • CNCF is launching the Telecom User Group (TUG) for telcos and their vendors who are using or aiming to use cloud native technologies in their networks.
  • The TUG will operate in a similar capacity to CNCF's End User Community (since telcos have always been included in CNCF's definition of end users). Unlike the End User Community, telecom vendors are also encouraged to participate in the TUG.
  • The TUG is not expected to do software development, but may write up requirements, best practices, gap analysis, or similar documents.


13 of 47

KubeCon + CloudNativeCon

  • KubeCon + CloudNativeCon Europe 2020
    • Amsterdam: TBD


14 of 47

KubeCon + CloudNativeCon Attendance


15 of 47

Appendix


16 of 47

Combating FUD Around MicroVMs

  • There has been a lot of Fear, Uncertainty, and Doubt (FUD) about the value of MicroVMs and similar sandbox technology
  • Micro virtual machine and sandbox technologies – including Firecracker, gVisor, Kata, Nabla, Singularity, and Unik – are promising options to run untrusted code securely on a cluster
  • MicroVMs are not necessary to address the noisy neighbor issue; that is what core Kubernetes features such as resource limits (for CPU and memory) and QoS (for networking) are for (a resource-limits sketch follows this list)
  • More generally, telcos are running their own (1st party) code or trusted vendor (2nd party) code, not untrusted 3rd party code
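To make the resource-limits point concrete, here is a minimal sketch using the official Kubernetes Python client to create a pod with CPU and memory requests and limits; the pod name, image, and sizes are placeholder values and nothing about it is CNF-specific.

```python
# Illustration of per-container resource limits, the core Kubernetes answer
# to CPU/memory noisy neighbors. Names, image, and sizes are placeholders.
from kubernetes import client, config

config.load_kube_config()  # or config.load_incluster_config() inside a cluster

pod = client.V1Pod(
    metadata=client.V1ObjectMeta(name="limited-nf"),
    spec=client.V1PodSpec(
        containers=[
            client.V1Container(
                name="nf",
                image="nginx:stable",  # placeholder workload
                resources=client.V1ResourceRequirements(
                    requests={"cpu": "500m", "memory": "256Mi"},
                    limits={"cpu": "1", "memory": "512Mi"},
                ),
            )
        ]
    ),
)
client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)
```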


17 of 47

Network Labs (pets) vs. Repeatable Testbed (cattle)

  • Networking equipment used to be separate hardware boxes that needed to be integrated in a lab for testing
  • Most network labs today are still a group of carefully tended pets whose results cannot be reliably reproduced
  • Modern networking is mainly done in software which can and should be checked into source control and replicated at any time
  • Network servers should be treated like cattle, not pets


18 of 47

The Importance of a Repeatable Testbed

  • A key driver of the Kubernetes project’s robustness has been the significant investment in continuous integration (CI) resources
    • Every pull request runs a large automated test suite
    • On any given weekday, we run 10,000 CI jobs
    • Every 2 days, we run a new scalability test of 150,000 containers across 5,000 virtual machines
    • Google provided CNCF a $9M grant of cloud credits to cover 3 years of testing
  • The CNF Testbed is a completely replicable platform for doing apples-to-apples networking comparisons of CNFs and VNFs


19 of 47

Three Major Benefits

  1. Cost savings
  2. Improved resiliency (to failures of individual CNFs, machines, and even data centers)
  3. Higher development velocity


20 of 47

Server Specifications: compute/worker nodes

Packet’s M2.xlarge (currently available)

  • CPU: 2 x Intel® Xeon® Gold 5120 Processor (28 physical cores)
  • RAM: 384 GB of DDR4 ECC RAM
  • Storage: 3.2 TB of NVMe Flash and 2 × 120 GB SSD
  • NIC: dual-port 10 Gbps Mellanox ConnectX-4

Packet’s N2.xlarge (available March 2019)

  • CPU: 2 x Intel® Xeon® Gold 5120 Processors (28 physical cores)
  • Same RAM and storage as the M2.xlarge
  • NIC: quad-port 10 Gbps Intel X710


21 of 47

Why This Was a Challenging Project: OpenStack

  • No existing 100% open source OpenStack installer w/baked-in high-performance dataplane
  • Limited choices for high-performance Layer-2 dataplane: SR-IOV, VPP, OVS+DPDK
  • OpenStack VPP-networking setup was not well documented
  • VPP-Neutron plugin did not support standard OVS setup and configuration (e.g., multiple port creation)
  • VNF test case deployment configuration with OpenStack-VPP
  • Apples-to-apples layer-2 underlay network for K8s
  • Physical hardware - Mellanox NICs and proprietary drivers
  • Provider limitations - no spanning tree support


22 of 47

Why This Was a Challenging Project: Kubernetes

  • Support for dropping in different data plane solutions
  • No Kubernetes installer w/baked-in high-performance data plane underlay was available
  • No CNI plugins which provide a high-performance layer-2 underlay were available
  • Network Service Mesh is a promising approach to dynamically configure the layer 2 network that is currently being manually configured, but it doesn’t yet meet our needs
  • Physical hardware - Mellanox NICs and proprietary drivers
  • Provider limitations - no spanning tree support


23 of 47

The challenge of transitioning VNFs to CNFs

  • Moving network functionality from physical hardware into software encapsulated in a virtual machine (P2V) is generally easier than containerizing the software (P2C or V2C)
  • Many network function virtualization VMs rely on kernel hacks or otherwise do not restrict themselves to just the stable Linux kernel userspace ABI
    • They also often need to use DPDK or SR-IOV to achieve sufficient performance
  • Containers provide nearly direct access to the hardware with little or no virtualization overhead
    • But they expect containerized applications to use the stable userspace Linux kernel ABI, not to bypass it


24 of 47

Areas for More Discussion

  • The strength of no longer being locked into specific OSs
    • Any Linux kernel version newer than 3.10 is acceptable
  • Multi-interface pods vs. Network Service Mesh
  • Complete parity for IPv6 functionality and dual-stack support in K8s
  • Security, and specifically recommendations from Google and Jess that come into play when hosting untrusted, user-provided code
    • Possible use of isolation layers such as Firecracker, gVisor, or Kata
  • Scheduling container workloads with network-related hardware constraints (similar to what’s been done for GPUs)
    • Network-specific functionality like traffic-shaping


25 of 47

A Service Function Chain: Snake Case

[Diagram: a single service function chain, snake case. OpenStack node: VNFs chained on a userspace-to-kernel dataplane (vSwitch) via vhost-user connections. Kubernetes node: CNFs chained on a userspace-to-userspace dataplane (vSwitch) via memif connections.]


26 of 47

A Service Function Chain: Pipeline Case

[Diagram: a single service function chain, pipeline case. OpenStack node: VNFs chained on a userspace-to-kernel dataplane (vSwitch) via vhost-user connections. Kubernetes node: CNFs chained on a userspace-to-userspace dataplane (vSwitch) via memif connections.]


27 of 47

Multiple Service Function Chains: Snake Case

[Diagram: multiple service function chains, snake case. OpenStack node: VNF chains on a userspace-to-kernel dataplane (vSwitch) via vhost-user connections. Kubernetes node: CNF chains on a userspace-to-userspace dataplane (vSwitch) via memif connections.]


28 of 47

Multiple Service Function Chains: Pipeline Case

[Diagram: multiple service function chains, pipeline case. OpenStack node: VNF chains on a userspace-to-kernel dataplane (vSwitch) via vhost-user connections. Kubernetes node: CNF chains on a userspace-to-userspace dataplane (vSwitch) via memif connections.]


29 of 47

Network Architecture Evolution

  • 1.0: Separate physical boxes for each component (e.g., routers, switches, firewalls)


30 of 47

Network Architecture 1.0


31 of 47

Network Architecture Evolution

  • 2.0: Physical boxes converted to virtual machines called Virtual Network Functions (VNFs), often running on OpenStack


32 of 47

Network Architecture 2.0


33 of 47

Network Architecture Evolution

  • 3.0: Cloud-native Network Functions (CNFs) running on Kubernetes on public, private, or hybrid clouds


34 of 47

Network Architecture 3.0

(hardware is the same as 2.0)


35 of 47

Evolving from VNFs to CNFs (Past)

[Diagram: Past: VNFs under the ONAP orchestrator, on OpenStack or VMware, running on bare metal or on Azure or Rackspace.]


36 of 47

Evolving from VNFs to CNFs (Present)

[Diagram: Present: VNFs on OpenStack plus CNFs on Kubernetes, both under the ONAP orchestrator, on bare metal or any cloud.]


37 of 47

Evolving from VNFs to CNFs (Future)

[Diagram: Future: CNFs and VNFs (via KubeVirt/Virtlet) on Kubernetes, under the ONAP orchestrator and OSS/BSS, on bare metal or any cloud.]


38 of 47

Technical Appendix


39 of 47

CNF Testbed Deployment stages

Common steps

  • Clone https://github.com/cncf/cnf-testbed and install any prerequisites listed in the README
  • Create the configuration: Packet API details, number of nodes, etc. (k8s example)
  • Run the (k8s or openstack) deploy cluster script, which provisions the Packet machines with Terraform

OpenStack

  • Terraform starts Ansible, which pre-configures the Packet machines (using the openstack infrastructure playbook), including installing network drivers, optimizing GRUB, and rebooting the compute nodes.
  • Ansible then runs the openstack install playbook, which configures the Packet switch and VLANs and then deploys OpenStack to the Packet nodes using Chef.
  • Ansible then installs and configures VPP as a vSwitch on all compute nodes in the cluster, using the OpenStack vpp-networking plugin.

Kubernetes

  • Cloud-init bootstraps the Kubernetes cluster on the Packet nodes. (Note: the next release will use kubeadm for bootstrapping k8s.)
  • Ansible then optimizes the system configuration, installs and configures the VPP vSwitch, and reboots the worker nodes.
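The common steps above are ordinary CLI work: clone the repo, write a configuration, and run the deploy script, which drives Terraform (and, from there, Ansible or cloud-init). The sketch below is one hypothetical way to wrap those steps from Python; the working directory and the deploy script name are placeholders, since the real entry points and options are defined in the repo's README.

```python
# Hypothetical wrapper around the common deployment steps. The working
# directory and the deploy script name are placeholders; see the
# cnf-testbed README for the real entry points and options.
import os
import subprocess

REPO = "https://github.com/cncf/cnf-testbed"
WORKDIR = os.path.expanduser("~/cnf-testbed")

def run(cmd, cwd=None):
    print("+", " ".join(cmd))
    subprocess.run(cmd, cwd=cwd, check=True)

if "PACKET_AUTH_TOKEN" not in os.environ:
    raise SystemExit("export your Packet API token first (variable name is illustrative)")

if not os.path.isdir(WORKDIR):
    run(["git", "clone", REPO, WORKDIR])

# Stand-in for "run the (k8s or openstack) deploy cluster script", which
# provisions the Packet machines with Terraform under the hood.
run(["./deploy_k8s_cluster.sh"], cwd=WORKDIR)  # hypothetical script name
```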


40 of 47

CNF vs. VNF Performance Comparison

The comparison test bed includes multi-node HA clusters for Kubernetes and OpenStack running chained dataplane CNFs and VNFs for performance comparison testing. All software is open source. The entire test bed and the comparison results can be recreated from the CLI by following the step-by-step documentation with a Packet.net account.

Each test bed consists of 6 physical machines per platform (OpenStack and Kubernetes):

  • OpenStack - 2 controllers and 3 compute nodes
  • Kubernetes - 2 masters and 3 worker nodes
  • Traffic generator - 1 NFVbench system

Provisioning and deployment of the K8s and OpenStack clusters uses Terraform, Ansible, and Kitchen/Chef. Network functions primarily use VPP, and performance testing is done with NFVbench, using TRex as the traffic generator.


41 of 47

OpenStack Cluster + Traffic generator

[Diagram: Controller 1, Controller 2, Compute 1, Compute 2, and Compute 3, plus the traffic generator, connected through the provider switch on the Packet layer 2 network.]


42 of 47

Kubernetes Cluster + Traffic generator

[Diagram: Master 1, Master 2, Worker 1, Worker 2, and Worker 3, plus the traffic generator, connected through the provider switch on the Packet layer 2 network.]


43 of 47

Vhost-user vs. memif: Stay in memory & stay in user space!

[Diagram: OpenStack node: the VNF sits behind a QEMU layer with virtio vNICs (vNIC1, vNIC2) and reaches the VPP vSwitch (DPDK) over vhost-user, with the path spanning user space and kernel space. K8s node (physical host): the CNF runs under the container runtime and reaches the VPP vSwitch (DPDK) over memif interfaces (memif1, memif2), staying in user space. Physical NICs (P NIC1, P NIC2) attach to the vSwitch on both nodes.]
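To make "stay in memory and stay in user space" more tangible: memif is a packet interface built on a memory region shared between two user-space processes (the vSwitch and the CNF), so frames move between them without traversing the kernel network stack. The sketch below is only a toy analogy using Python's multiprocessing shared memory; real memif adds ring buffers, descriptors, and a control channel over a Unix socket.

```python
# Toy analogy for the shared-memory idea behind memif: two user-space
# processes exchange a "frame" through a shared buffer, never touching a
# kernel socket. This is NOT the memif protocol, just the core idea.
from multiprocessing import Process, shared_memory

FRAME = b"user-space frame"

def consumer(shm_name, size):
    shm = shared_memory.SharedMemory(name=shm_name)
    print("consumer read:", bytes(shm.buf[:size]))
    shm.close()

if __name__ == "__main__":
    shm = shared_memory.SharedMemory(create=True, size=len(FRAME))
    shm.buf[: len(FRAME)] = FRAME          # "producer" writes the frame
    p = Process(target=consumer, args=(shm.name, len(FRAME)))
    p.start()
    p.join()
    shm.close()
    shm.unlink()
```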


44 of 47

CNF Testbed Software components

[Diagram: software components per node type (OpenStack compute, OpenStack controller, Kubernetes worker, Kubernetes master, traffic generator), all on Ubuntu 18.04 LTS with kernel 4.4.0-134 and provisioned through the Packet API. Components shown include DPDK, memif, vhost-user, the VPP vSwitch, VPP IP routers, QEMU/KVM, the VPP Neutron agent, OpenStack "Rocky" services (Neutron, API), Docker, K8s v1.12.2, and etcd.]


45 of 47

What about inter-node connectivity?

[Diagram: Node #1 through Node #N, each running its own dataplane (vSwitch).]


46 of 47

vCPE Use Case


47 of 47

Project links
