Kubernetes Scheduling
CS-548, April 2022
Yannis Sfakianakis
Purpose of scheduling in Kubernetes?
How does it work?
Node labels – Node selection (Filtering)
kubectl label nodes master on-master=true
Node Affinity (Filtering – Scoring)
apiVersion: v1
kind: Pod
metadata:
  name: with-affinity-anti-affinity
spec:
  affinity:
    nodeAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
        nodeSelectorTerms:
        - matchExpressions:
          - key: kubernetes.io/os
            operator: In
            values:
            - linux
      preferredDuringSchedulingIgnoredDuringExecution:
      - weight: 50
        preference:
          matchExpressions:
          - key: label-1
            operator: In
            values:
            - key-1
  containers:
  - name: with-node-affinity
    image: k8s.gcr.io/pause:2.0
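The two halves of the node affinity manifest map onto the two scheduler phases: the required rule filters nodes, the preferred rule scores them. The following is a minimal Python sketch of that idea, not the scheduler's actual code. The Node class, the helper names, and the three example nodes are hypothetical, and only the In operator with a single selector term is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical stand-in for a cluster node: a name plus its labels.
    name: str
    labels: dict = field(default_factory=dict)


def matches_in(labels, key, values):
    """True when the node carries `key` with one of `values` (the In operator)."""
    return labels.get(key) in values


def filter_nodes(nodes, required):
    """Filtering: keep nodes that satisfy every required (key, values) expression."""
    return [n for n in nodes if all(matches_in(n.labels, k, v) for k, v in required)]


def affinity_score(node, preferred):
    """Scoring: sum the weights of the preferred terms the node satisfies."""
    return sum(w for w, k, v in preferred if matches_in(node.labels, k, v))


nodes = [
    Node("node-a", {"kubernetes.io/os": "linux", "label-1": "key-1"}),
    Node("node-b", {"kubernetes.io/os": "linux"}),
    Node("node-c", {"kubernetes.io/os": "windows", "label-1": "key-1"}),
]
required = [("kubernetes.io/os", ["linux"])]   # requiredDuringScheduling...
preferred = [(50, "label-1", ["key-1"])]       # preferredDuringScheduling...

feasible = filter_nodes(nodes, required)
ranked = sorted(feasible, key=lambda n: affinity_score(n, preferred), reverse=True)
print([(n.name, affinity_score(n, preferred)) for n in ranked])
# -> [('node-a', 50), ('node-b', 0)]
```

node-c is dropped in the filtering step even though it matches the preferred label, because a preference only affects ranking among nodes that already passed the required rule.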
Anti Affinity – Taints and tolerations (Filtering)
kubectl taint nodes node1 key1=value1:NoSchedule
tolerations:
- key: "key1"
  operator: "Equal"
  value: "value1"
  effect: "NoSchedule"
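How a toleration is matched against a taint can be sketched in a few lines of Python. This is an illustrative model under stated assumptions, not the scheduler's implementation: the Taint and Toleration classes and both function names are hypothetical, an empty key or effect on a toleration is treated as a wildcard, and only the NoSchedule and NoExecute effects are assumed to block scheduling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Taint:
    key: str
    value: str
    effect: str


@dataclass(frozen=True)
class Toleration:
    key: str = ""            # empty key: matches any taint key
    operator: str = "Equal"  # "Equal" or "Exists"
    value: str = ""
    effect: str = ""         # empty effect: matches any effect


def tolerates(tol, taint):
    """True when a single toleration covers a single taint."""
    if tol.effect and tol.effect != taint.effect:
        return False
    if tol.key and tol.key != taint.key:
        return False
    if tol.operator == "Exists":
        return True
    return tol.value == taint.value  # operator "Equal"


def schedulable(taints, tolerations):
    """Filtering: every blocking taint must be covered by some toleration."""
    blocking = [t for t in taints if t.effect in ("NoSchedule", "NoExecute")]
    return all(any(tolerates(tol, t) for tol in tolerations) for t in blocking)


node1 = [Taint("key1", "value1", "NoSchedule")]  # kubectl taint nodes node1 key1=value1:NoSchedule

print(schedulable(node1, []))                                                     # False
print(schedulable(node1, [Toleration("key1", "Equal", "value1", "NoSchedule")]))  # True
print(schedulable(node1, [Toleration("key1", "Equal", "other", "NoSchedule")]))   # False
print(schedulable(node1, [Toleration("key1", "Exists", effect="NoSchedule")]))    # True
```

A toleration only permits placement on the tainted node. It does not attract the pod to that node, which is why taints are usually combined with node affinity when pods must land on dedicated nodes.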
Resource bin packing (scoring)
Bin packing configuration example
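apiVersion: kubescheduler.config.k8s.io/v1beta3
kind: KubeSchedulerConfiguration
profiles:
# ...
- pluginConfig:
  # In v1beta3 the strategy is set on the NodeResourcesFit plugin.
  - name: NodeResourcesFit
    args:
      scoringStrategy:
        type: RequestedToCapacityRatio
        resources:
        - name: intel.com/foo
          weight: 3
        - name: intel.com/bar
          weight: 5
        requestedToCapacityRatio:
          # Score rises with utilization, so fuller nodes are preferred.
          shape:
          - utilization: 0
            score: 0
          - utilization: 100
            score: 10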
How score for bin packing is calculated
Requested resources:
  intel.com/foo: 2
  memory: 256 MB
  cpu: 2
Resource weights:
  intel.com/foo: 5
  memory: 1
  cpu: 3
Available:
  intel.com/foo: 4
  memory: 1 GB
  cpu: 8
Used:
  intel.com/foo: 1
  memory: 256 MB
  cpu: 1
intel.com/foo = resourceScoringFunction((2+1), 4)
              = 100 - ((4-3)*100/4)
              = 100 - 25
              = 75      # requested + used = 75% of available
              = rawScoringFunction(75)
              = 7       # floor(75/10)
memory        = resourceScoringFunction((256+256), 1024)
              = 100 - ((1024-512)*100/1024)
              = 50      # requested + used = 50% of available
              = rawScoringFunction(50)
              = 5       # floor(50/10)
cpu           = resourceScoringFunction((2+1), 8)
              = 100 - ((8-3)*100/8)
              = 37.5    # requested + used = 37.5% of available
              = rawScoringFunction(37.5)
              = 3       # floor(37.5/10)
NodeScore     = ((7 * 5) + (5 * 1) + (3 * 3)) / (5 + 1 + 3)
              = 49 / 9
              = 5       # floor(5.44)
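The arithmetic above can be checked with a short Python sketch. It assumes a linear shape that maps 0% utilization to score 0 and 100% to score 10, which is what the worked figures imply. The function names mirror the ones used in the calculation, but this is a re-implementation for illustration, not the scheduler's code.

```python
def resource_score(requested, used, capacity):
    """Utilization after placing the pod, as a percentage of capacity."""
    return 100 - (capacity - (requested + used)) * 100 / capacity


def raw_score(utilization):
    """Assumed linear shape from (0, 0) to (100, 10), rounded down."""
    return int(utilization // 10)


def node_score(resources):
    """Weighted average of per-resource scores, rounded down.

    `resources` maps a resource name to (requested, used, capacity, weight).
    """
    weighted = sum(
        raw_score(resource_score(req, used, cap)) * weight
        for req, used, cap, weight in resources.values()
    )
    total_weight = sum(weight for *_, weight in resources.values())
    return weighted // total_weight


example = {
    # name: (requested, used, available, weight); memory is in MB
    "intel.com/foo": (2, 1, 4, 5),
    "memory": (256, 256, 1024, 1),
    "cpu": (2, 1, 8, 3),
}

for name, (req, used, cap, _) in example.items():
    print(name, raw_score(resource_score(req, used, cap)))
# intel.com/foo 7
# memory 5
# cpu 3

print(node_score(example))  # 5
```

Because the score grows with utilization, a node that is already partly full outranks an empty one, which is what packs pods onto fewer nodes.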
Pod priority
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: high-priority
value: 1000000
globalDefault: false
description: "This priority class should be used for XYZ service pods only."
Pod priority
apiVersion: v1
kind: Pod
metadata:
  name: nginx
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  priorityClassName: high-priority
Pod preemption
Node-pressure eviction
Pod selection for eviction
API-initiated Eviction
{
  "apiVersion": "policy/v1",
  "kind": "Eviction",
  "metadata": {
    "name": "quux",
    "namespace": "default"
  }
}
curl -v -H 'Content-type: application/json' \
  https://your-cluster/api/v1/namespaces/default/pods/quux/eviction -d @evict.json
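The Eviction object and the URL it is posted to both follow from the pod's name and namespace. A small Python sketch that builds the two together is shown here. The helper name build_eviction_request is hypothetical, and the sketch only constructs the request; it does not contact a cluster or handle authentication.

```python
import json


def build_eviction_request(pod, namespace="default"):
    """Return the API path and JSON body for evicting `pod`."""
    path = f"/api/v1/namespaces/{namespace}/pods/{pod}/eviction"
    body = {
        "apiVersion": "policy/v1",
        "kind": "Eviction",
        "metadata": {"name": pod, "namespace": namespace},
    }
    return path, json.dumps(body, indent=2)


path, payload = build_eviction_request("quux")
print(path)     # /api/v1/namespaces/default/pods/quux/eviction
print(payload)  # same object as the evict.json shown above
```

The printed payload can be saved as evict.json and sent with the curl command above, with the path appended to the cluster's API server address.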
More details