1 of 20

kubeadm Cluster Creation Internals

From Self-Hosting to Upgradability and HA

Lucas Käldström, 7th December 2017 - KubeCon Austin

2 of 20

$ whoami

Lucas Käldström, Upper Secondary School Student, just turned 18

CNCF Ambassador and Certified Kubernetes Administrator

Speaker at KubeCon Berlin 2017 and now here at KubeCon Austin

Kubernetes Maintainer since April 2016, active in the community for 2+ years

SIG Cluster Lifecycle co-lead and kubeadm maintainer

Driving luxas labs, which currently does contract work for Weaveworks

A guy who has never attended a computing class

3 of 20

Agenda

  1. What kubeadm is & how it fits into the ecosystem
  2. What’s common for every TLS-secured Kubernetes cluster
  3. What self-hosting is & how you can self-host your cluster
  4. How kubeadm handles upgrades
  5. How high availability can be achieved with kubeadm

This talk will dive deep into the technical aspects of creating Kubernetes clusters

4 of 20

What is kubeadm and why should I care?

= A tool that sets up a minimum viable, best-practice Kubernetes cluster

[Diagram: where kubeadm fits in the ecosystem. kubeadm runs on every master (A through N*) and node (1 through N) and covers only the bootstrapping layer. Beneath it sit the machines and the infrastructure (cloud provider, load balancers, monitoring, logging), which the Cluster API Spec, the Cluster API and a Cluster API Implementation are meant to manage; above it sit the Kubernetes API and the Addons API*. *=Yet to be done/WIP]

5 of 20

kubeadm vs kops

Two different projects, two different scopes

[Diagram: the same layered stack, with project scopes overlaid. kubeadm covers only the bootstrapping layer on each master (A through N*) and node (1 through N), while kops spans the whole stack: machines and infrastructure (cloud provider, load balancers, monitoring, logging), the Cluster API Spec, Cluster API and Cluster API Implementation, up through the Kubernetes API and the Addons API*. *=Yet to be done/WIP]

6 of 20

Key design takeaways

  • kubeadm’s task is to set up a best-practice cluster for each minor version
  • The user experience should be simple, and the cluster reasonably secure
  • kubeadm’s scope is limited; intended to be a building block
    • Only ever deals with the local filesystem and the Kubernetes API
    • Agnostic to how exactly the kubelet is run
    • Setting up or favoring a specific CNI network is out of scope
  • Composable architecture with everything divided into phases

Audience: users building their first cluster from scratch & higher-level tools like kops & kubicorn

7 of 20

Kubernetes’ high-level component architecture

[Diagram: the Master runs the API Server (REST API), the Controller Manager (controller loops), the Scheduler (binds Pods to Nodes) and etcd (key-value DB, the single source of truth). Each Node (1 through 3) runs an OS, a container runtime, the kubelet and networking. The user talks to the cluster through the API Server.]
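On a kubeadm cluster, the master components are visible as Pods in the kube-system namespace; kubeadm names the Static Pods <component>-<node-name>. A sketch, assuming the master's hostname is "master":

$ kubectl -n kube-system get pods
# Shows etcd-master, kube-apiserver-master, kube-controller-manager-master
# and kube-scheduler-master, plus add-on Pods such as kube-proxy and kube-dns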

8 of 20

What does `kubeadm init` really do -- part 1?

First, kubeadm creates the certificates necessary for setting up a cluster with TLS-secured communication and API Aggregation support.

Then, client certificates are generated and embedded in KubeConfig files for the first actors that need identities.

Phase: Certificates (/etc/kubernetes/pki)

  • Root CA cert
  • API serving cert
  • API-to-kubelet client cert
  • Front Proxy CA cert
  • ServiceAccount private key
  • API client cert

Phase: Kubeconfig (/etc/kubernetes/*.conf)

  • Root CA cert
  • Kubelet client cert
  • Admin client cert
  • Controller Manager client cert
  • Scheduler client cert

(In the diagram these are marked as either needed for all clusters or kubeadm-specific.)
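After these two phases, the files on disk look like this (a listing based on the kubeadm v1.8/v1.9 defaults):

$ ls /etc/kubernetes/pki
apiserver.crt  apiserver-kubelet-client.crt  ca.crt  front-proxy-ca.crt  front-proxy-client.crt  sa.key
apiserver.key  apiserver-kubelet-client.key  ca.key  front-proxy-ca.key  front-proxy-client.key  sa.pub

$ ls /etc/kubernetes/*.conf
/etc/kubernetes/admin.conf   /etc/kubernetes/controller-manager.conf
/etc/kubernetes/kubelet.conf /etc/kubernetes/scheduler.conf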

9 of 20

What does `kubeadm init` really do -- part 2?

kubeadm generates Static Pod manifest files for etcd (unless an external etcd cluster is used) and for the control plane components.

The kubelet runs the Static Pods, so kubeadm assumes a kubelet is already running.

kubeadm then waits for the kubelet to start the Static Pods, an operation that may fail in many ways.

Phase: Etcd (/etc/kubernetes/manifests)

  • Local etcd Static Pod, localhost:2379 (alternatively, host etcd externally)

Phase: Control Plane (/etc/kubernetes/manifests)

  • API Server, ${MASTER_IP}:6443
  • Controller Manager, localhost:10252
  • Scheduler, localhost:10251

The running kubelet starts the Static Pods.
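The generated manifests can be inspected directly on the master (file names as written by kubeadm v1.8/v1.9):

$ ls /etc/kubernetes/manifests
etcd.yaml  kube-apiserver.yaml  kube-controller-manager.yaml  kube-scheduler.yaml

# The kubelet watches this directory and runs a Static Pod for every
# manifest file it finds there.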

10 of 20

What does `kubeadm init` really do -- part 3?

To make the master dedicated (non-schedulable for regular workloads) and identifiable, it is tainted and labelled with a common-practice label

For `kubeadm upgrade` to remember the config passed to `kubeadm init`, the config is uploaded to the cluster

A Node Bootstrap Token is created and granted privileges to add a node

Lastly, kube-proxy and kube-dns / CoreDNS are deployed

Phase: Mark the master node

kubectl taint node <master> node-role.kubernetes.io/master="":NoSchedule
kubectl label node <master> node-role.kubernetes.io/master=""

Phase: Upload kubeadm Configuration

kubectl -n kube-system create configmap kubeadm-config --from-file kubeadm.yaml

Phase: Create a Bootstrap Token for a node (consumed as sketched below)

kubeadm token create <token>

Phase: Deploy mandatory add-ons

  • kube-proxy DaemonSet
  • kube-dns / CoreDNS Deployment
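A sketch of how the Bootstrap Token is then consumed when adding a node (the --discovery-token-ca-cert-hash flag exists since v1.8; the token and addresses here are illustrative):

# On the master: list the token(s) kubeadm created
$ kubeadm token list

# On the new node: join using the token; pinning the hash of the root CA's
# public key makes the discovery tamper-proof
$ kubeadm join --token abcdef.0123456789abcdef 192.168.1.100:6443 \
    --discovery-token-ca-cert-hash sha256:<hash>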

11 of 20

Build your cluster from scratch with `kubeadm phase`

The `kubeadm phase` command lets you invoke atomic sub-tasks of a full installation operation. You don’t have to choose the “full meal deal” (kubeadm init) anymore. This is what `kubeadm init` looks like phase-wise:

$ kubeadm alpha phase certificates all

$ kubeadm alpha phase kubeconfig all

$ kubeadm alpha phase etcd local

$ kubeadm alpha phase controlplane all

$ systemctl start kubelet

$ kubeadm config upload from-flags

$ kubeadm alpha phase mark-master $(kubectl get no --no-headers | awk '{print $1}')

$ kubeadm alpha phase bootstrap-token cluster-info /etc/kubernetes/admin.conf

$ kubeadm alpha phase bootstrap-token node allow-post-csrs

$ kubeadm alpha phase bootstrap-token node allow-auto-approve

$ kubeadm token create

$ kubeadm alpha phase addons kube-dns

$ kubeadm alpha phase addons kube-proxy

12 of 20

Setting up a dynamic TLS-secured cluster

[Diagram: the dynamic TLS bootstrap flow. The Controller Manager (CN=system:kube-controller-manager) and the Scheduler (CN=system:kube-scheduler) talk to the API Server over HTTPS (6443). The API Server uses a kubelet client cert (O=system:masters) to reach each kubelet's self-signed HTTPS endpoint (10250) for logs/exec calls. An already-joined kubelet (node-1) holds a client cert with CN=system:node:node-1, O=system:nodes. A joining kubelet (node-2) starts with only a Bootstrap Token and the trusted CA: it POSTs a CSR to the API Server, the CSR Approver (authorized via a SAR webhook) approves it, and the CSR Signer PATCHes the CSR with a signed certificate for CN=system:node:node-2, O=system:nodes. CSR=Certificate Signing Request, SAR=Subject Access Review]
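This flow can be observed with kubectl. With kubeadm's RBAC bindings in place the approval happens automatically, but done by hand it would look roughly like this (the CSR name is illustrative):

# List pending/approved CertificateSigningRequests
$ kubectl get csr

# Approve a pending node CSR manually
$ kubectl certificate approve node-csr-<hash>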

13 of 20

kubeadm & self-hosted Kubernetes cluster

Self-hosting = using Kubernetes primitives (e.g. DaemonSets, ConfigMaps) to run and configure the control plane itself

The self-hosting concept was initially developed by CoreOS and the bootkube team. Also see the other self-hosting talk here at KubeCon.

We’re now in the process of “upstreaming” the work done in bootkube, in a way that makes it easier for any Kubernetes cluster to be self-hosted.

Building blocks that can be used for pivoting to self-hosting exist in kubeadm
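As of v1.8/v1.9, the pivot can be switched on at init time via a feature gate; a sketch (the self-hosted-* DaemonSet names reflect how kubeadm prefixes the pivoted components):

$ kubeadm init --feature-gates=SelfHosting=true

# The control plane components now run as DaemonSets instead of Static Pods:
# self-hosted-kube-apiserver, self-hosted-kube-controller-manager,
# self-hosted-kube-scheduler
$ kubectl -n kube-system get daemonsets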

14 of 20

How is a self-hosted cluster bootstrapped?

kubeadm either hosts the control plane in Static Pods managed via the filesystem, or as self-hosted DaemonSets.

The process is modular; kubeadm builds the self-hosted control plane from the current state of the world, i.e. the running Static Pods.
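In v1.9 this pivot is also exposed as its own phase; a sketch, using the alpha phase subcommand as named in v1.9:

# Convert the Static Pod control plane into self-hosted DaemonSets
$ kubeadm alpha phase selfhosting convert-from-staticpods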

15 of 20

Upgrading clusters with kubeadm

  1. Upgrades are made easy with `kubeadm upgrade plan` and `kubeadm upgrade apply` (see the sketch below)
  2. In v1.8, upgrades basically shifted Static Pod files around on disk
  3. Self-hosted upgrades in HA clusters work by modifying normal Kubernetes resources
  4. From v1.9 on, automated downgrades are supported as well
  5. In a future release we’ll look at “runtime reconfiguration”, i.e. invoking an upgrade with new config but the same version number
  6. Read the proposal
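A minimal upgrade session (the target version is illustrative):

# Check against which versions the cluster can be upgraded
$ kubeadm upgrade plan

# Apply the upgrade to the chosen version
$ kubeadm upgrade apply v1.9.0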

16 of 20

The “just works” kubeadm HA feature request

Adding support for joining masters is the most popular feature request

What challenges do we face in making a slick multi-master UX like this?

Three primary challenges:

  • Managing state reliably in an HA etcd cluster with TLS communication
  • Sharing & rotating the common certificates from master A to master B
  • Making the kubelets able to address all masters & dynamic reconfig

kubeadm init

  • Add a node: kubeadm join --token abcdef.0123456789abcdef 192.168.1.100:6443
  • Add a master: kubeadm join master --token fedcba.fedcba0987654321 192.168.1.100:6443

17 of 20

How to achieve HA with kubeadm today?

[Diagram: masters A, B and C are each set up with kubeadm init; every master runs the API Server, Controller Manager and Scheduler and holds the same shared certificates. All API servers talk to an HA etcd cluster, and an external load balancer or DNS-based API server resolution sits in front of them. Kubelets 1-5 are joined as nodes with kubeadm join.]

Do-it-yourself (sketched below):

  1. Set up an HA etcd cluster
  2. Copy the certificates from master A to B and C
  3. Set up a load balancer in front of the API servers
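A rough sketch of steps 2 and 3, run from master A (the host names and the load-balanced address are hypothetical):

# Copy the shared certificates and keys to the other masters
$ scp -r /etc/kubernetes/pki master-b:/etc/kubernetes/
$ scp -r /etc/kubernetes/pki master-c:/etc/kubernetes/

# Run kubeadm init on masters B and C with the same configuration, then
# point kubelets and clients at the load-balanced API server address,
# e.g. https://k8s-api.example.com:6443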

18 of 20

One possible (?) solution

[Diagram: master A is created with kubeadm init; masters B and C join with kubeadm join master. On each master the kubelet runs the API Server, Controller Manager and Scheduler as self-hosted DaemonSets, the certificates live in Secrets, and etcd runs as one Pod per master, managed by an etcd-operator Deployment. Nodes 1-3 join with kubeadm join; each runs kube-proxy as a DaemonSet and, possibly, Envoy as a DaemonSet for reaching the API servers. DS=DaemonSet]

19 of 20

What now?

Follow the SIG Cluster Lifecycle YouTube playlist

Check out the meeting notes for our weekly SIG meetings in Zoom

Join the #sig-cluster-lifecycle (for dev) and #kubeadm (for support) Slack channels

Check out Diego Pontoriero’s talk “Self-Hosted Kubernetes: How and Why”

Read the two latest SIG updates on the Kubernetes blog in January and August

Check out the kubeadm setup guide, reference doc and design doc

Read what the kubeadm contribution / work cycle looks like and start contributing to kubeadm!

20 of 20

Thank you!