Background
https://docs.google.com/document/d/1Ny03h6IDVy_e_vmElOqR7UdTPAG_RNydhVE1Kx54kFQ/edit
Admin creates control and data networks
The Network object should be annotated with the “kubernetes.v1.cni.cncf.io/resourceName” annotation, for example:
apiVersion: "kubernetes.cni.cncf.io/v1"
kind: Network
metadata:
  name: n1-ctr-net
  annotations:
    kubernetes.v1.cni.cncf.io/resourceName: abc-plugin.io/ctr-net
spec:
  plugin: abc-plugin
---
apiVersion: "kubernetes.cni.cncf.io/v1"
kind: Network
metadata:
  name: n1-data-net
  annotations:
    kubernetes.v1.cni.cncf.io/resourceName: abc-plugin.io/data-net
spec:
  plugin: abc-plugin
Network Creation Flow
1. Admin creates a network.
2. The Network CRD object gets persisted at the API server.
3. The plugin observes the network creation.
4. Using details from the Network object and its own local state, the plugin advertises network availability on the ListAndWatch DPAPI on node1 and node2, e.g. abc-plugin.io/ctr-net (see the sketch after this list).
5. Kubelet updates the node status for node1 and node2, e.g. abc-plugin.io/ctr-net: “1”.
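A minimal sketch of step 4, using simplified stand-ins for the kubelet device plugin (DPAPI) message types rather than the real gRPC API; the resource name abc-plugin.io/ctr-net comes from the Network object above, while the device ID is illustrative only.

package main

import "fmt"

// Simplified stand-ins for the DPAPI ListAndWatch types; the real plugin
// streams ListAndWatchResponse messages to the kubelet over gRPC.
type Device struct {
	ID     string
	Health string
}

type ListAndWatchResponse struct {
	Devices []*Device
}

// advertiseNetwork builds the response the plugin would stream once it has
// observed the n1-ctr-net Network object and mapped it to local capacity.
// Each healthy Device becomes one unit of the extended resource, so the
// kubelet ends up reporting abc-plugin.io/ctr-net: "1" in the node status.
func advertiseNetwork(resourceName string, deviceIDs []string) ListAndWatchResponse {
	resp := ListAndWatchResponse{}
	for _, id := range deviceIDs {
		resp.Devices = append(resp.Devices, &Device{ID: id, Health: "Healthy"})
	}
	fmt.Printf("advertising %d device(s) for %s\n", len(resp.Devices), resourceName)
	return resp
}

func main() {
	// One attachment available for the control network on this node.
	advertiseNetwork("abc-plugin.io/ctr-net", []string{"ctr-net-dev-0"})
}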
[Diagram: network creation flow (steps 1–5) across the API server, default scheduler, admission controllers, and a node running Kubelet (Device Manager), CNI, cni-dp, and pods, with the Networks objects held at the API server.]
How to request network attachment
kind: Pod
metadata:
  name: N1-Pod
  namespace: N1-namespace
  annotations:
    kubernetes.v1.cni.cncf.io/networks: ctr-net, data-net
NetworkResource admission controller plugin
limits:
  abc-plugin.io/data-net: "1"
  abc-plugin.io/ctr-net: "1"
Pod Object After Admission Controller
kind: Pod
metadata:
  name: N1-Pod
  namespace: N1-namespace
  annotations:
    kubernetes.cni.cncf.io/v1/networks: ctr-net, data-net
    kubernetes.cni.cncf.io/v1/contextID: 1234-56-7890-234234-456456
spec:
  containers:
  - name: myapp-container
    image: busybox
    resources:
      requests:
        abc-plugin.io/data-net: "1"
        abc-plugin.io/ctr-net: "1"
      limits:
        abc-plugin.io/data-net: "1"
        abc-plugin.io/ctr-net: "1"
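A minimal sketch of the mutation the NetworkResource admission controller performs to get from the first pod object to the second; the Pod struct is a simplified stand-in for the Kubernetes API object, and networkToResource is an assumed lookup built from the resourceName annotations on the Network objects.

package main

import (
	"fmt"
	"strings"
)

// Simplified stand-in for the parts of the Pod object the controller touches.
type Pod struct {
	Annotations map[string]string
	Requests    map[string]string
	Limits      map[string]string
}

// Assumed lookup: network name -> its kubernetes.v1.cni.cncf.io/resourceName value.
var networkToResource = map[string]string{
	"ctr-net":  "abc-plugin.io/ctr-net",
	"data-net": "abc-plugin.io/data-net",
}

// mutate adds a request/limit of "1" for each requested network's extended
// resource and stamps the pod with the contextID used later by cni-dp.
func mutate(pod *Pod, contextID string) error {
	nets, ok := pod.Annotations["kubernetes.v1.cni.cncf.io/networks"]
	if !ok {
		return nil // nothing to do for pods that request no networks
	}
	for _, n := range strings.Split(nets, ",") {
		res, known := networkToResource[strings.TrimSpace(n)]
		if !known {
			return fmt.Errorf("unknown network %q", n)
		}
		pod.Requests[res] = "1"
		pod.Limits[res] = "1"
	}
	pod.Annotations["kubernetes.cni.cncf.io/v1/contextID"] = contextID
	return nil
}

func main() {
	pod := &Pod{
		Annotations: map[string]string{
			"kubernetes.v1.cni.cncf.io/networks": "ctr-net, data-net",
		},
		Requests: map[string]string{},
		Limits:   map[string]string{},
	}
	// In the real controller the contextID would be a freshly generated UID.
	if err := mutate(pod, "1234-56-7890-234234-456456"); err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", pod)
}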
Pod Creation Flow
1. User triggers pod creation.
2. The NetworkResource admission controller mutates the resource requests/limits and annotates the pod with a contextUID.
3. The Pod object entry gets created at the API server.
4. The scheduler picks node1 or node2 for the pod, based on the Resource.Limits.
5. On the chosen node, Kubelet starts pod admission; within the Admit phase, Kubelet (Device Manager) invokes the DPAPI “Allocate” RPC on cni-dp:
Allocate(annotations, dev-id)
The AllocateResponse sent back to Kubelet includes mount paths and environment variables (see the sketch after this list).
6. The plugin finds the contextUID in the pod annotations and locally stores the contextUID-to-dev-id mapping.
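A minimal sketch of steps 5–6 on the cni-dp side, assuming the proposed Allocate(annotations, dev-id) variant that receives pod annotations alongside device IDs; the request/response types are simplified stand-ins, not the actual kubelet device plugin API, and the returned mounts/envs are illustrative only.

package main

import (
	"fmt"
	"sync"
)

// Simplified stand-ins for the proposed Allocate(annotations, dev-id) call.
type AllocateRequest struct {
	Annotations map[string]string // pod annotations, including the contextID
	DeviceIDs   []string
}

type AllocateResponse struct {
	Envs   map[string]string
	Mounts map[string]string // host path -> container path
}

// cniDP is the plugin daemon; it remembers which devices were handed out for
// which contextUID so the later CNI ADD call can be correlated (step 6).
type cniDP struct {
	mu              sync.Mutex
	contextToDevIDs map[string][]string
}

func (p *cniDP) Allocate(req AllocateRequest) (AllocateResponse, error) {
	ctxID, ok := req.Annotations["kubernetes.cni.cncf.io/v1/contextID"]
	if !ok {
		return AllocateResponse{}, fmt.Errorf("pod has no contextID annotation")
	}
	p.mu.Lock()
	p.contextToDevIDs[ctxID] = append(p.contextToDevIDs[ctxID], req.DeviceIDs...)
	p.mu.Unlock()

	// Whatever the device needs inside the container; values are illustrative.
	return AllocateResponse{
		Envs:   map[string]string{"ABC_PLUGIN_DEVICES": fmt.Sprint(req.DeviceIDs)},
		Mounts: map[string]string{"/dev/abc0": "/dev/abc0"},
	}, nil
}

func main() {
	dp := &cniDP{contextToDevIDs: map[string][]string{}}
	resp, err := dp.Allocate(AllocateRequest{
		Annotations: map[string]string{"kubernetes.cni.cncf.io/v1/contextID": "1234-56-7890-234234-456456"},
		DeviceIDs:   []string{"ctr-net-dev-0"},
	})
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n%+v\n", resp, dp.contextToDevIDs)
}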
[Diagram: pod creation flow (steps 1–10) across the API server, admission controllers, default scheduler, and a node running Kubelet (Device Manager), the cni-dp daemonset, and the CNI meta-plugin (e.g. Multus) invoking the cni-static-binary, which talks to cni-dp over a Unix socket.]
7. CNI invokes ADD to get the network configuration done for the pod. ADD/DEL is handled by a meta-plugin; an example of a meta-plugin is Multus.
8. The meta-plugin finally invokes abc-plugin, a stateless static executable, for ADD/DEL.
9. The binary executable passes the contextUID (annotation kubernetes.cni.cncf.io/v1/contextID) to the cni-dp plugin daemonset. This could be Unix domain socket based communication (see the sketch after this list).
10. cni-dp, which also implements the DPAPI, maintains the contextUID-to-dev-id mappings (step 6). cni-dp provides the interface name to the executable, which is required to get the pod connected to the requested network.
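A minimal sketch of the step 9–10 exchange, assuming a simple line-oriented request/response over a Unix domain socket at a hypothetical /var/run/cni-dp.sock path; the actual wire format between abc-plugin and cni-dp is not defined by this document.

package main

import (
	"bufio"
	"fmt"
	"net"
)

// queryInterface is what the abc-plugin static binary would do during ADD:
// send the contextID taken from the pod annotation over the Unix socket and
// read back the interface/device name that cni-dp resolved from its
// contextUID-to-dev-id map (step 6).
func queryInterface(socketPath, contextID string) (string, error) {
	conn, err := net.Dial("unix", socketPath)
	if err != nil {
		return "", fmt.Errorf("connecting to cni-dp: %w", err)
	}
	defer conn.Close()

	if _, err := fmt.Fprintf(conn, "%s\n", contextID); err != nil {
		return "", fmt.Errorf("sending contextID: %w", err)
	}
	iface, err := bufio.NewReader(conn).ReadString('\n')
	if err != nil {
		return "", fmt.Errorf("reading interface name: %w", err)
	}
	return iface[:len(iface)-1], nil
}

func main() {
	// Socket path and contextID value are illustrative only.
	iface, err := queryInterface("/var/run/cni-dp.sock", "1234-56-7890-234234-456456")
	if err != nil {
		panic(err)
	}
	fmt.Println("attach pod to interface:", iface)
}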
NOTE: the meta-plugin, the CNI plugin daemon, and the device plugin can be a single multi-threaded process, or they can be separate processes; in the latter case a communication mechanism is needed between the CNI plugin daemon and the device plugin.
I would prefer to run the CNI plugin daemon and the device plugin (DPAPI server) in a single process to reduce the number of moving pieces, but that is not really a goal of this discussion.
Pod Deletion Flow