K8s autoscaling
Why we need to scale
What kinds of scaling we can do in K8s
Resource for Scale
Resource Type: CPU and memory are each a resource type. A resource type has a base unit.
CPU resource units
Limits and requests for CPU resources are measured in cpu units. 1 CPU unit = 1 physical core (or 1 vCPU) = 1000 millicpu (1000m).
Memory resource units
Limits and requests for memory are measured in bytes. You can express memory as a plain integer or as a fixed-point number using one of these quantity suffixes: E, P, T, G, M, k — or their power-of-two equivalents Ei, Pi, Ti, Gi, Mi, Ki.
Resource Example
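A minimal Pod spec illustrating both unit styles (the name and image are placeholders, not from the original slides):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: resource-demo          # placeholder name
spec:
  containers:
  - name: app
    image: nginx               # placeholder image
    resources:
      requests:
        cpu: 250m              # 0.25 of a CPU unit
        memory: 64Mi           # 64 mebibytes
      limits:
        cpu: "1"               # 1 full CPU unit (1000m)
        memory: 128Mi
```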
How to Manually Scale
Ways to scale pods
Scaling a Deployment also scales its underlying ReplicaSets and their Pods.
Example: kubectl scale deploy/application-cpu --replicas 2
Add new Kubernetes Worker Node Manually
Prerequisites to bring up a worker node
Add new Kubernetes Worker Node Manually
kubeadm token create --print-join-command  (run on the control plane; prints a ready-to-use join command with a fresh token)
openssl x509 -pubkey -in /etc/kubernetes/pki/ca.crt | openssl rsa -pubin -outform der 2>/dev/null | openssl dgst -sha256 -hex | sed 's/^.* //'  (computes the CA certificate hash for --discovery-token-ca-cert-hash)
kubectl cluster-info  (shows the control-plane endpoint, i.e. <control-plane-host>:<control-plane-port>)
kubeadm join <control-plane-host>:<control-plane-port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>  (run on the new worker node)
Why we need autoscaling
Revisit Scale Example
Without an autoscaling solution in place, the traditional approach to mitigating such scalability failures involves:
Problems with the Human Touch
How can autoscaling help?
What kinds of autoscaling we can do in K8s
Pod Level Scale
VPA
VPA
kubectl describe vpa application-cpu
vpa.yaml
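A sketch of what vpa.yaml might contain for the application-cpu Deployment used throughout these slides (the actual manifest is not included here):

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: application-cpu
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: application-cpu
  updatePolicy:
    updateMode: "Auto"   # VPA may evict Pods to apply its new recommendations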
deployment.yaml
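A matching deployment.yaml sketch; the image and resource values are illustrative assumptions, since the slides do not show the file:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: application-cpu
spec:
  replicas: 1
  selector:
    matchLabels:
      app: application-cpu
  template:
    metadata:
      labels:
        app: application-cpu
    spec:
      containers:
      - name: application-cpu
        image: registry.example.com/application-cpu:latest   # placeholder image
        resources:
          requests:
            cpu: 500m      # starting point; VPA will recommend new values
            memory: 50Mi
          limits:
            cpu: "2"
            memory: 500Mi
```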
VPA
VPA Limitations
VPA recommends 18 GB of memory, but the node only has 16 GB -> the Pod stays Pending forever
HPA
HPA
TargetNumOfPods = ceil(sum(CurrentPodsCPUUtilization) / Target)
HPA Example
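Applying the formula above: three pods running at 80%, 70% and 90% CPU against a 50% target give ceil(240 / 50) = ceil(4.8) = 5 pods. A declarative HPA targeting 50% average CPU for the same Deployment might look like this (min/max values are illustrative):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: application-cpu
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: application-cpu
  minReplicas: 1
  maxReplicas: 10
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 50
```

The same autoscaler can also be created imperatively: kubectl autoscale deployment application-cpu --cpu-percent=50 --min=1 --max=10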
HPA Limitations
Metrics Server
What’s it for
Metrics Server collects resource metrics from Kubelets and exposes them in Kubernetes apiserver through Metrics API for use by Horizontal Pod Autoscaler and Vertical Pod Autoscaler.
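The Metrics API lives under the metrics.k8s.io group; on a cluster with Metrics Server installed you can query it directly, or through the kubectl top views built on it:

```
# Raw Metrics API, served through the apiserver:
kubectl get --raw "/apis/metrics.k8s.io/v1beta1/nodes"

# Human-friendly views of the same data:
kubectl top nodes
kubectl top pods
```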
What isn’t it for
Metrics Server
Node Level Scale
Cluster Autoscaler
Cluster Autoscaler Behavior
c444e24e3915@cloudshell:~ (qwiklabs-gcp-03-a94f05d7b8a0)$ kubectl get nodes
NAME                                          STATUS   ROLES    AGE   VERSION
gke-scaling-demo-default-pool-b182e404-5l2v   Ready    <none>   6m    v1.20.8-gke.900
c444e24e3915@cloudshell:~ (qwiklabs-gcp-03-a94f05d7b8a0)$ kubectl scale deploy/application-cpu --replicas 2
deployment.apps/application-cpu scaled
c444e24e3915@cloudshell:~ (qwiklabs-gcp-03-a94f05d7b8a0)$ kubectl get pods
NAME                               READY   STATUS    RESTARTS   AGE
application-cpu-7879778795-8t8bn   1/1     Running   0          2m29s
application-cpu-7879778795-rzxc7   0/1     Pending   0          5s
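The Pending Pod above is what triggers a node scale-up. On GKE, the Cluster Autoscaler is enabled per node pool; a sketch, assuming the cluster is named scaling-demo (inferred from the node name) and the bounds are illustrative:

```
gcloud container clusters update scaling-demo \
  --enable-autoscaling \
  --node-pool default-pool \
  --min-nodes 1 --max-nodes 3 \
  --zone <zone>
```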
Cluster Autoscaler Behavior
Scale Up:
Pod Event
Node Event
Cluster Autoscaler Behavior
ScaleDown: node removed by cluster autoscaler
NodeNotReady
Deleting node
RemovingNode: Removing Node from Controller.
Scale Down:
Cluster Autoscaler Limitations
Reference