KVM Forum 2024, Brno, Czech Republic: Unleashing SR-IOV on Virtual Machines
Yui Washizu
NTT Open Source Software Center, yui.washidu@gmail.com
Akihiko Odaki
Daynix Computing Ltd., akihiko.odaki@daynix.com
Introduction
Multi-tenant cloud environments
Two goals of multi-tenant cloud environments
Single Root I/O Virtualization (SR-IOV)
A single PCIe device presents a Physical Function (PF) and Virtual Functions (VFs), achieving the two goals of multi-tenant cloud environments
Problem with offloading container networks on VMs
Containers on VMs require their own virtual network, but a VM cannot access the SR-IOV PF to control it
[Figure: Deploying to a physical machine vs. a virtual machine. On a physical server, each container receives a SmartNIC VF, and the admin or network-construction software can access the PF and control the network. On a VM (container host), the guest is given only a VF: it can neither access the PF nor create a "VF's VF" for its containers, so it cannot control the network.]
Our proposal
Proposal: SR-IOV emulation
Emulate SR-IOV-capable PCIe devices in the VMM (QEMU)
[Figure: SR-IOV emulation. QEMU exposes a virtual PF and virtual VFs to the VM (container host); each virtual function is backed by a SmartNIC VF on the physical server. The admin or network-construction software in the guest can access the (virtual) PF and configure the virtual network.]
Advantages of SR-IOV emulation
Avoiding emulation overhead with vDPA
SR-IOV emulation governs only the control path; with vDPA, the data path bypasses QEMU
[Figure: Control and data paths with SR-IOV emulation. QEMU provides the virtual PF and virtual VFs as vdpa-backed virtio-net devices. The control path runs through QEMU, while the data path goes directly between the VM and the SmartNIC VFs attached to the offloaded L2 switch behind the PF.]
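For concreteness, a vdpa-backed configuration might look like the following sketch. This is not the exact setup from the talk: the management-device PCI address, the vhost-vdpa character device, and the netdev ID are placeholders that depend on the host.

# host: create a vDPA device on top of a SmartNIC VF (PCI address is a placeholder)
vdpa dev add name vdpa0 mgmtdev pci/0000:65:00.2
# QEMU: back an emulated virtio-net function with the resulting vhost-vdpa node
qemu-system-x86_64 ... \
  -netdev vhost-vdpa,id=n0,vhostdev=/dev/vhost-vdpa-0 \
  -device virtio-net-pci,netdev=n0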
Adapting SR-IOV emulation to other use cases
Replace backends for other use cases
[Figure: Replacing backends. For acceleration, the virtual PF and VFs are backed by vdpa virtio-net devices on SmartNIC VFs behind the offloaded L2 switch. For testing and debugging, the same virtual PF and VFs can instead be backed by TAP devices on a Linux bridge in the host OS, with no SmartNIC required.]
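A testing-and-debugging variant could be assembled roughly as follows; the bridge and tap interface names are hypothetical.

# host: create a Linux bridge and a tap device (names are hypothetical)
ip link add br0 type bridge
ip tuntap add dev tap0 mode tap
ip link set tap0 master br0
ip link set br0 up
ip link set tap0 up
# QEMU: back an emulated virtio-net function with the tap device instead of vdpa
qemu-system-x86_64 ... \
  -netdev tap,id=n0,ifname=tap0,script=no,downscript=no \
  -device virtio-net-pci,netdev=n0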
Interface for virtio SR-IOV embedded switch
virtio 1.3 will expose the SR-IOV embedded switch capability as device groups
Future work: offload packet switching
Provide a comprehensive solution for network offloading on VMs with OVS
[Figure: Offloading packet switching with OVS. Inside the VM, OVS programs flows with ovs-ofctl add-flow on a bridge for the guest, and invokes the tc command so that TC flower in the virtual PF driver configures offloading. The virtual PF and VFs are vdpa-backed virtio-net devices, so the rules take effect in the offloaded L2 switch on the SmartNIC behind the PF.]
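Inside the guest, this would likely mirror the usual OVS hardware-offload recipe; a sketch, where the bridge and VF-representor port names are hypothetical:

# guest: enable TC-flower-based hardware offload in OVS
ovs-vsctl set Open_vSwitch . other_config:hw-offload=true
# guest: put a VF representor on a bridge and program a flow
ovs-vsctl add-br br0
ovs-vsctl add-port br0 rep0
ovs-ofctl add-flow br0 "in_port=rep0,actions=normal"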
Performance Verification
Verification target’s setup
Confirmed a 2x performance improvement with vDPA in the following setup:
Server model | HPE ProLiant DL360 Gen10
CPU | Intel Xeon CPU 4210R @ 2.4 GHz
NIC | Mellanox Technologies MT27710 family ConnectX-6 Dx (100G)
Host/Guest OS | Rocky Linux 9.2
QEMU version | QEMU 8.1.1 (with the virtio SR-IOV emulation patch applied)
Kubernetes version | 1.27.6
CNI plugin | Calico v3.26.3
SR-IOV CNI plugin | 2.7.0 (*)
netperf | 2.7

(*) with slight modification to adapt to virtio's sysfs
Environments
Compare the following two environments
[Figure: The two environments. Without offloading in the VM, a VF of the physical NIC is assigned to the VM as a virtio-net device, and the Kubernetes network reaches the container through software. With offloading in the VM, a virtual VF backed by vDPA is handed to the container through the SR-IOV CNI plugin, and the host configures L2 packet switching on the physical NIC. In both cases traffic flows to an external server.]
Metrics and measurement section
Verification metrics are throughput and latency
[Figure: The same two environments as above, with the measurement section marked between the container in the VM and the external server.]
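netperf, listed in the setup, provides standard tests for both metrics. The exact invocations used in the talk are not shown; a plausible sketch, with a placeholder address for the external server:

# throughput: bulk TCP transfer toward the external server
netperf -H 192.0.2.1 -t TCP_STREAM
# latency: TCP request/response round trips
netperf -H 192.0.2.1 -t TCP_RR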
Throughput
[Chart: throughput results in Mbps. With offloading in the VM, throughput roughly doubles (x2).]
Latency
[Chart: single-core latency results in μsec. With offloading in the VM, latency drops by about 100 μsec.]
Development of SR-IOV Emulation in QEMU
History of SR-IOV emulation
2014: igb patch series by Knut Omang
igb: a network device (Intel 82576)
2019: Virtio version 1.1 specification with SR-IOV support
2022: nvme upstreamed by Lukasz Maniak
2023: igb upstreamed by Akihiko Odaki (details on daynix.github.io)
2023: virtio-net-pci RFC by Yui Washizu
2024: virtio-net-pci for upstreaming by Akihiko Odaki (in progress)
Adding SR-IOV to virtio-net-pci
virtio-net is paravirtualized, unlike igb and nvme, which model real hardware
Challenge: flexible configuration
Conventional PCI multifunction
Just specify multifunction and addr:

-netdev user,id=n -netdev user,id=o \
-netdev user,id=p -netdev user,id=q \
-device pcie-root-port,id=b \
-device virtio-net-pci,netdev=q,bus=b,addr=0x0.0x3 \
-device virtio-net-pci,netdev=p,bus=b,addr=0x0.0x2 \
-device virtio-net-pci,netdev=o,bus=b,addr=0x0.0x1 \
-device virtio-net-pci,netdev=n,bus=b,addr=0x0.0x0,multifunction=on
Composable SR-IOV device
Add the sriov-pf property:
-netdev user,id=n -netdev user,id=o \
-netdev user,id=p -netdev user,id=q \
-device pcie-root-port,id=b \
-device virtio-net-pci,netdev=q,bus=b,addr=0x0.0x3,sriov-pf=f \
-device virtio-net-pci,netdev=p,bus=b,addr=0x0.0x2,sriov-pf=f \
-device virtio-net-pci,netdev=o,bus=b,addr=0x0.0x1,sriov-pf=f \
-device virtio-net-pci,netdev=n,bus=b,addr=0x0.0x0,id=f

The implementation is a bit more complicated though.
SR-IOV as guest-controlled hotplugging
The VF lifetime is controlled by the guest
Similar to hotplug, but the guest expects VFs to appear immediately and without hotplug notifications, because physical devices behave that way
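On Linux, the guest triggers VF creation through the standard sysfs interface for SR-IOV; a sketch, with a hypothetical PCI address for the virtual PF:

# guest: enable two VFs on the emulated PF (PCI address is hypothetical)
echo 2 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
# guest: disable them again
echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs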
Issues with hotplugging
Today: literally hotplugging VFs as the guest requests.
-netdev user,id=n -netdev user,id=o \
-netdev user,id=p -netdev user,id=q \
-device pcie-root-port,id=b \
-device virtio-net-pci,bus=b,addr=0x0.0x3,netdev=q,sriov-pf=f \
-device virtio-net-pci,bus=b,addr=0x0.0x2,netdev=p,sriov-pf=f \
-device virtio-net-pci,bus=b,addr=0x0.0x1,netdev=o,sriov-pf=f \
-device virtio-net-pci,bus=b,addr=0x0.0x0,netdev=n,id=f
Avoiding hotplugging
[PATCH v16 00/13] hw/pci: SR-IOV related fixes and improvements (broadly, realizing VFs together with the PF and toggling them as the guest requests, instead of hotplugging them)
Validation of SR-IOV device configuration
SR-IOV imposes several restrictions on how VFs may be configured; QEMU validates the command-line configuration to satisfy these requirements.
Summary