1 of 67

Exploring modern and secure operations of Kubernetes clusters on the Edge

Lucas Käldström - CNCF Ambassador

9th of June, 2021 - Open Data Science Conference (Online)

Image credit: @ashleymcnamara

2 of 67

$ whoami

Lucas Käldström, 2nd-year student at Aalto, 21 yo

CNCF Ambassador, Certified Kubernetes Administrator and former Kubernetes WG/SIG Lead

KubeCon Speaker in Berlin, Austin, Copenhagen, Shanghai, Seattle, Barcelona (Keynote) & San Diego

Former Kubernetes approver and subproject owner, active in the community for 5+ years. Worked on e.g. SIG Cluster Lifecycle => kubeadm to GA.

Weaveworks contractor since 2017. Weave Ignite and Racklet co-author.

Cloud Native Nordics co-founder & meetup organizer

3 of 67

Warning: This talk does not cover MLOps directly; it tells you what to look into to be able to do MLOps well on the edge

4 of 67

  1. Secure the boot process

Part 1: Edge Security


5 of 67

What’s complex about booting a machine?

  1. Correctness and openness of the boot binaries
    • If the binary is proprietary, how can you be sure it doesn’t have a backdoor or CVE?
  2. Duplication of important drivers across bootloaders
    • Most bootloaders are written in C, and implement e.g. a UDP driver by themselves
  3. Introspection of the boot flow
    • Ensure that the right (read: not malicious) binaries were used, in the right order
  4. Remote verification of an edge node being safe
    • How can a remote actor ensure that a remote node is in a well-known state?

6 of 67

High-level boot steps on embedded devices

  1. On-chip bootloader in ROM, unchangeable
    • This code is fixed at manufacture time, and often has very limited functionality
  2. Hardware initialization bootloader step
    • Loaded from e.g. SPI Flash, EEPROM, or SD Card by 1).
    • Often subject to tight size constraints in the CPU cache/SRAM, e.g. 512 kB.
    • Initializes (D)RAM. May run the “ARM stub”, a short piece of hardware init code.
    • Open-source projects include u-boot SPL and Coreboot.
  3. Raw executable to load into RAM
    • Can be Linux or some other “bare-metal” executable like TianoCore EDK2 or u-boot
    • Loaded into the beginning of RAM and executed by 2)

7 of 67

Open Source Bootloaders

Coreboot is a bootloader popular with x86 devices, but it also supports ARM and other platforms.

Oreboot is like Coreboot, but with all C replaced by Rust.

u-boot is the most popular bootloader for ARM SBCs.

Join the Open Source Firmware movement!

===> https://slack.osfw.dev/

8 of 67

Example: Raspberry Pi 4

[Diagram: Raspberry Pi 4 boot flow]

1. The on-chip ROM (closed source, runs on the BCM GPU/SoC) loads the EEPROM bootloader (closed source, < 512 kB).
2. The EEPROM bootloader fetches start.elf and the “boot files” (config.txt, Device Tree) over TFTP, or from SD Card, USB or SSD: complex custom logic.
3. start.elf (closed source, ~2 MB) acts on config.txt, runs the ARM stub, initializes the RAM and sets up the CPUs.
4. kernel8.img* (open source) is loaded into RAM and handed over to the CPU. What now?

* Can be Linux, TianoCore EDK2 and/or u-boot

9 of 67

Ideal Scenario?

[Diagram: idealized boot flow]

1. The on-chip ROM loads u-boot SPL** (open source, < 1 MB***) from SPI, SD Card or USB.
2. u-boot SPL runs the ARM stub / HW init, initializes the RAM, sets up the CPUs, and reads the “boot files”: a Boot Script and the Device Tree.
3. kernel8.img* (open source) is loaded into RAM and handed over to the CPU: a “de facto” standard with a common codebase.

* Can be Linux, TianoCore EDK2 and/or u-boot
** Or Coreboot, Oreboot or UEFI PEI
*** Approximation. Board-specific

10 of 67

Hey! We’re now missing out on netbooting?!?


11 of 67

Yay, now addressed problem 1: Correctness and openness of the boot binaries

12 of 67

Trusted Firmware-A (TF-A) + OP-TEE

Similar to Intel SGX*; partitions the CPU into a “secure” and a “non-secure” part.

OP-TEE is the “secure world” OSS impl. (Has Rust and Go libraries ready.)

TF-A is packaged as the “ARM stub” that is run right before the main executable.

* However, it seems like TF-A implements a subset of SGX’s features

13 of 67

Add “secure world” with TF-A

[Diagram: the idealized boot flow, now with a “secure world”]

1. The on-chip ROM loads u-boot SPL (open source, < 1 MB) from SPI, SD Card or USB; u-boot SPL initializes the RAM, sets up the CPUs, and reads the Boot Script and Device Tree.
2. TF-A takes the place of the ARM stub and partitions execution into a “non-secure” world, which runs kernel8.img, and a “secure” world, which runs the OP-TEE implementation.

14 of 67

  • Secure the boot process
  • Securely net-booting the OS

Part 1: Edge Security


15 of 67

LinuxBoot

Idea: Replace UEFI or a “high-level” bootloader with a minimal Linux for better security, reproducibility and transparency.

=> Any Python developer can now be a firmware developer

kexec: Change the running kernel without rebooting

Only compile in what you need

16 of 67

Compare “traditional” UEFI vs LinuxBoot

[Diagram: traditional UEFI vs LinuxBoot]

Both flows go through the same three stages: Hardware Initialization, fetching the OS to boot, and running the desired OS. Traditional UEFI implements all three stages itself; the LinuxBoot way keeps the UEFI PEI for Hardware Initialization and replaces the rest with a minimal Linux that fetches and runs the desired OS.

17 of 67

u-root

The first/only process running in your LinuxBoot, packaged inside an initramfs in the kernel.

Written in Go, OSS on GitHub. Include any external binary, or write your own boot logic in Go.

Provides a set of common boot logic written in a memory-safe language; e.g. it provides a kexec method.
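To make this concrete: below is a minimal sketch of what such Go boot logic can look like. It uses the raw kexec syscalls via golang.org/x/sys/unix instead of u-root’s higher-level helpers, and the kernel/initramfs paths and cmdline are made-up examples.

// kexecdemo: a minimal sketch of u-root-style boot logic in Go.
// It stages a new kernel with kexec_file_load(2), then jumps into it.
package main

import (
    "log"

    "golang.org/x/sys/unix"
)

func main() {
    // Hypothetical paths; a real bootloader would have fetched these.
    kernel, err := unix.Open("/boot/vmlinuz", unix.O_RDONLY, 0)
    if err != nil {
        log.Fatalf("open kernel: %v", err)
    }
    initrd, err := unix.Open("/boot/initramfs.cpio", unix.O_RDONLY, 0)
    if err != nil {
        log.Fatalf("open initrd: %v", err)
    }

    // Stage the new kernel; the running kernel loads and checks it.
    cmdline := "console=ttyS0 root=/dev/vda1"
    if err := unix.KexecFileLoad(kernel, initrd, cmdline, 0); err != nil {
        log.Fatalf("kexec_file_load: %v", err)
    }

    // Jump straight into the staged kernel, skipping the firmware.
    if err := unix.Reboot(unix.LINUX_REBOOT_CMD_KEXEC); err != nil {
        log.Fatalf("reboot(kexec): %v", err)
    }
}

u-root’s pkg/boot wraps this kind of logic behind friendlier, tested APIs.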

18 of 67

Add in LinuxBoot, not u-boot or TianoCore

[Diagram: the boot flow with LinuxBoot]

1. The on-chip ROM loads u-boot SPL (< 1 MB) from SPI, SD Card or USB; u-boot SPL initializes the RAM, sets up the CPUs, and reads the Boot Script and Device Tree.
2. TF-A again splits execution into a “secure” world (the OP-TEE implementation) and a “non-secure” world.
3. The non-secure world boots a µLinux* with u-root, which kexecs into the target Linux.

* µLinux < 16 MB

19 of 67

TUF (The Update Framework)

“A framework for securing software update systems”

A set of steps to follow to ensure software updates or artifacts are not (at least easily) compromised by attackers.

Used in e.g. “docker pull/push” (Notary) and Bottlerocket.

Graduated CNCF project.

20 of 67

OCI (Open Container Initiative)

The de facto container format. Donated to the Linux Foundation by Docker in 2015. Now industry-standard.

Provides specifications for a container image (packaging), runtime (isolation) and registry (distribution).

Evolving into a generic artifact distribution mechanism with e.g. Deis’ ORAS.

21 of 67

LinuxBoot + u-root + OCI/ORAS = ociboot*

Idea: We want netbooting and a “modern” way to build the OS. Read: build your OS image using a Dockerfile.

OCI provides image packaging/distribution.

With LinuxBoot + u-root, one can write “firmware” in Go that downloads OCI images and kexecs.

ORAS allows any image format (e.g. qcow2) instead of OCI, but still uses the OCI distribution part.

* This project is still in the idea stage, not implemented yet
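Since ociboot is only an idea so far, here is just a sketch of what its core step could look like with the google/go-containerregistry crane package; the image reference and output path are hypothetical:

// ocibootsketch: pull an OCI image holding a root filesystem and
// flatten it to a tarball on disk, ready for a subsequent kexec.
package main

import (
    "log"
    "os"

    "github.com/google/go-containerregistry/pkg/crane"
)

func main() {
    // Pull the OS image from an OCI registry. Pull secrets would be
    // wired in via crane options (e.g. crane.WithAuth).
    img, err := crane.Pull("registry.example.com/edge/os-rootfs:v1")
    if err != nil {
        log.Fatalf("pull: %v", err)
    }

    out, err := os.Create("/tmp/rootfs.tar")
    if err != nil {
        log.Fatalf("create: %v", err)
    }
    defer out.Close()

    // Flatten all image layers into a single filesystem tarball.
    if err := crane.Export(img, out); err != nil {
        log.Fatalf("export: %v", err)
    }
    // From here, ociboot would extract the rootfs to RAM or disk
    // and kexec into the kernel it contains.
}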

22 of 67

ociboot workflow

Benefits:

  1. All complex “netboot” logic written in Go
  2. Re-use all device drivers from Linux
  3. Works regardless of compute type
  4. Notary used behind the scenes for verification

[Diagram: ociboot workflow]

LinuxBoot (running ociboot) does an OCI pull of the OS image from an OCI Registry, using optional Pull Secrets; it then extracts the OS image to RAM or disk and kexecs into the Target OS.

23 of 67

LinuxBoot + u-root + TUF = tufboot*

ociboot would use Notary under the hood for the OCI pull.

However, Notary is only one implementation of TUF. TUF is flexible enough for many more (advanced) trust & delegation flows.

tufboot would download any supported artifact securely, given a TUF Root of Trust JSON spec.

ociboot & tufboot can be integrated into u-root or webboot.

* This project is still in the idea stage, not implemented yet
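The trust model can be illustrated with a deliberately simplified Go sketch: first verify the metadata signature against the root-of-trust key, then verify the downloaded artifact against the hash the metadata lists. A real implementation would use a proper TUF client such as theupdateframework/go-tuf, which adds roles, signature thresholds, expiry and delegations.

// tufverifysketch: simplified illustration of TUF-style verification.
package main

import (
    "crypto/ed25519"
    "crypto/sha256"
    "encoding/hex"
    "log"
)

// verifyArtifact checks that the metadata is signed by the trusted root
// key and that the artifact matches the hash named by the metadata.
// (A real client would parse the hash out of the metadata itself.)
func verifyArtifact(rootKey ed25519.PublicKey, metadata, sig, artifact []byte, wantSHA256 string) bool {
    if !ed25519.Verify(rootKey, metadata, sig) {
        return false // metadata not signed by the root of trust
    }
    sum := sha256.Sum256(artifact)
    return hex.EncodeToString(sum[:]) == wantSHA256
}

func main() {
    // Stand-in root of trust; in tufboot this would ship in firmware.
    pub, priv, _ := ed25519.GenerateKey(nil)

    artifact := []byte("disk image bytes")
    sum := sha256.Sum256(artifact)
    metadata := []byte(`{"targets":{"os.img":{"sha256":"` + hex.EncodeToString(sum[:]) + `"}}}`)
    sig := ed25519.Sign(priv, metadata)

    if verifyArtifact(pub, metadata, sig, artifact, hex.EncodeToString(sum[:])) {
        log.Println("artifact verified, safe to boot")
    }
}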

24 of 67

tufboot workflow

Benefits:

  • All complex “netboot” logic written in Go
  • Re-use all device drivers from Linux
  • Works regardless of compute type
  • Advanced trust delegations possible

[Diagram: tufboot workflow]

LinuxBoot (running tufboot) does a TUF download of the OS image from an S3 bucket or HTTP(S) server, verified against a TUF Root of Trust JSON; it then extracts the OS image to RAM or disk and kexecs into the Target OS.

25 of 67

Yay, now addressed problem 2: Duplication of important drivers across bootloaders

26 of 67

  • Secure the boot process
  • Securely net-booting the OS
  • Remote Attestation

Part 1: Edge Security


27 of 67

Remote Attestation

Problem: How can I trust that a machine (e.g. on the edge) booted correctly and wasn’t tampered with?

In case the “cloud” part needs to give it high-privilege credentials for accessing sensitive resources/data, it should be confident that the machine is safe first.

Remote Attestation is a way to solve this problem.

28 of 67

Trusted Platform Module (TPM)

“a dedicated microcontroller designed to secure hardware through integrated cryptographic keys.”

Can generate keys, sign/verify/encrypt/decrypt data, and store Platform Configuration Registers (PCRs)

PCRs can only be extended, not set:

pcr[i] = hash(pcr[i] || extendArg)

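A tiny simulation of that extend rule (plain Go, no TPM involved) shows why the final PCR value pins down the exact boot chain; the “boot stages” below are stand-ins:

// pcrsim: simulate how a TPM PCR accumulates boot measurements.
// A PCR can only be extended: pcr = SHA-256(pcr || measurement),
// so the final value encodes the exact sequence of boot binaries.
package main

import (
    "crypto/sha256"
    "fmt"
)

func extend(pcr, measurement []byte) []byte {
    h := sha256.New()
    h.Write(pcr)         // current PCR value
    h.Write(measurement) // hash of the next boot stage
    return h.Sum(nil)
}

func main() {
    pcr := make([]byte, sha256.Size) // PCRs start zeroed at reset

    // Hypothetical boot chain; each stage is measured before execution.
    for _, stage := range [][]byte{
        []byte("u-boot SPL binary"),
        []byte("LinuxBoot kernel"),
        []byte("target Linux kernel"),
    } {
        sum := sha256.Sum256(stage)
        pcr = extend(pcr, sum[:])
    }
    // Changing any stage changes the final PCR, so data sealed
    // against this value could no longer be unsealed.
    fmt.Printf("final PCR: %x\n", pcr)
}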

29 of 67

Trusted Platform Module (TPM)

PCRs form a good way to seal, not just encrypt, data.

Sealing means “encrypt with both a key and the PCRs”.

In a conventional Static Root of Trust for Measurements (SRTM) flow, the PCR register is extended with the hash of the next boot executable before it is executed.

=> Unsealing can only happen if all the boot binaries were correct

30 of 67

Yay, now addressed problem 3: Introspection of the boot flow

31 of 67

Remote Attestation

Nonce (random number) prevents replay attacks.

Nonce + PCRs => Quote

The quote is validated by the server.

The server can send a secret back, encrypted with the TPM key.

Image from Simma, Armin. (2015). Trusting Your Cloud Provider. Protecting Private Virtual Machines.

32 of 67

SRTM

[Diagram: SRTM measured boot + remote attestation flow]

Boot chain: on-chip ROM (closed source) => u-boot SPL => LinuxBoot => kexec => Target Linux, measured into the TPM’s PCR 0:
1. Extend with the u-boot SPL hash
2. Extend with the LinuxBoot hash
3. Extend with the Target Linux hash

Attestation:
4. The App asks the Attestation Server for a random nonce and the PCR list
5. The App gets a PCR quote from the TPM, signed with its endorsement key (EK)
6. The Attestation Server validates that the sent quote matches the reference quote
7. The Attestation Server encrypts the App Secret with the EK
8. The App decrypts the App Secret with the EK
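Steps 4 to 6 boil down to a small amount of verification logic on the server. The sketch below is a stand-in: a production server would parse real TPM2 quote structures (e.g. with google/go-attestation), whereas here the “quote” is simply a signature over nonce || PCRs:

// attestverify: stripped-down server side of remote attestation.
package main

import (
    "crypto/ed25519"
    "log"
)

// validateQuote checks that the quote is signed by the device's known
// key (standing in for the TPM's EK) and covers both the fresh nonce
// and the reference PCR values the server expects.
func validateQuote(deviceKey ed25519.PublicKey, nonce, referencePCRs, quoteSig []byte) bool {
    quoted := append(append([]byte{}, nonce...), referencePCRs...)
    return ed25519.Verify(deviceKey, quoted, quoteSig)
}

func main() {
    pub, priv, _ := ed25519.GenerateKey(nil)

    nonce := []byte("random-nonce-1234") // fresh per request: stops replays
    pcrs := []byte("pcr0=...")           // placeholder PCR bank contents

    // Device side: the TPM signs the nonce together with its PCRs.
    sig := ed25519.Sign(priv, append(append([]byte{}, nonce...), pcrs...))

    // Server side: only on success is the App Secret released,
    // encrypted so that only this TPM can decrypt it.
    if validateQuote(pub, nonce, pcrs, sig) {
        log.Println("node attested; releasing app secret")
    }
}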

33 of 67

Yay, now addressed problem 4: Remote verification of an edge node being safe

34 of 67

Good resources on TPMs

Remote Attestation helper scripts, sealing keys using PCRs, and more https://safeboot.dev/

StackExchange answer on SRTM vs DRTM

google/go-attestation


35 of 67

  • Kubernetes Cluster Lifecycle

Part 2: Edge Automation


36 of 67

What’s complex about managing a cluster?

  • Reinventing the wheel for Kubernetes mgmt
    • Don’t write it all yourself. Use upstream, community-backed building blocks
  • Interoperability between many different providers
    • Tasks like creating, upgrading and autoscaling clusters can be wildly different
  • Declaratively controlling a fleet of edge clusters
    • Making sure a set of edge clusters stays in sync at all times is a challenge
  • Keeping ingested edge data in sync with the cloud
    • How to deal with network interruptions of edge clusters without app modification

37 of 67

kubeadm

[Diagram: the cluster stack (Infrastructure, Machines, Bootstrapping, Addons) and the layers that cover it]

= The official tool to bootstrap a minimum viable, best-practice Kubernetes cluster

Layer 1, Cluster API: a Cluster API Spec plus per-provider Cluster API Implementations manage the Infrastructure and Machines (Cloud Provider, Load Balancers).

Layer 2, kubeadm: runs on each Control Plane and Node to handle Bootstrapping up to the Kubernetes API.

Layer 3, Addon Operators: Addons on top of the Kubernetes API, e.g. Monitoring and Logging.

38 of 67

kubeadm vs an “end-to-end solution”

[Diagram: the same stack, compared against an end-to-end solution]

An end-to-end solution spans the whole stack: Infrastructure, Machines, Bootstrapping and Addons (Cloud Provider, Load Balancers, Monitoring, Logging, driven by the Cluster API spec and implementation). kubeadm deliberately covers only the Bootstrapping layer on each Control Plane and Node.

kubeadm is built to be part of a higher-level solution

39 of 67

k3s

A kubeadm-like deployment mechanism where all Kubernetes components are integrated into one binary.

Combined with the removal of a set of features not needed at the edge, this means a small binary footprint.

CNCF sandbox project. More opinionated than kubeadm, which makes it easier to get going with, but less extensible.

40 of 67

Yay, now addressed problem 5: Reinventing the wheel for Kubernetes mgmt

41 of 67

Cluster API

The next step after kubeadm

“To make the management of (X) clusters across (Y) providers simple, secure, and configurable.”

“How can I manage any number of clusters in a similar fashion to how I manage deployments in Kubernetes?”


42 of 67

Declarative clusters

  • With Kubernetes we manage our applications declaratively
    • Why not for the cluster itself?
  • With the Cluster API, we can declaratively define the desired cluster state
    • Operator implementations reconcile the state
    • Use Spec & Status like the rest of k8s
    • Common management solutions for e.g. upgrades, autoscaling and repair
    • Allows for GitOps workflows

apiVersion: cluster.x-k8s.io/v1alpha4
kind: MachineDeployment
metadata:
  name: "test-cluster-md-0"
spec:
  clusterName: "test-cluster"
  replicas: 3
  template:
    spec:
      clusterName: "test-cluster"
      version: v1.20.1
      bootstrap:
        configRef:
          name: "test-cluster-md-0"
          apiVersion: bootstrap.cluster.x-k8s.io/v1alpha4
          kind: EKSConfigTemplate
      infrastructureRef:
        name: "test-cluster-md-0"
        apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
        kind: AWSMachineTemplate
---
apiVersion: infrastructure.cluster.x-k8s.io/v1alpha4
kind: AWSMachineTemplate
metadata:
  name: "test-cluster-md-0"
spec:
  template:
    spec:
      instanceType: "standard-4vcpu-8gb"
      iamInstanceProfile: "test-iam-profile"
      sshKeyName: "my-personal-ssh-key"

43 of 67

Yay, now addressed problem 6: Interoperability between many different providers

44 of 67

  • Kubernetes Cluster Lifecycle
  • Automate the edge with GitOps

Part 2: Edge Automation


45 of 67

GitOps: A cloud-native paradigm

GitOps, coined by Alexis Richardson, CEO of Weaveworks

Idea: Declaratively describe the desired state of all your infrastructure in a versioned backend like Git, and have controllers execute towards that state. Observe-diff-act

Allows for better reproducibility, drift detection, mean-time-to-recovery, control, and more.
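The observe-diff-act loop itself fits in a few lines of Go. In this sketch the State type and the two “observe” stubs are toy stand-ins for a Git checkout and a Kubernetes API read:

// gitopsloop: the observe-diff-act loop at the heart of GitOps.
package main

import (
    "log"
    "time"
)

// State maps resource names to their (toy) desired spec.
type State map[string]string

func desiredFromGit() State { // observe: what Git says
    return State{"deploy/app": "v2"} // e.g. git pull + parse manifests
}

func actualFromCluster() State { // observe: what the cluster runs
    return State{"deploy/app": "v1"} // e.g. read from the API server
}

func main() {
    for {
        desired := desiredFromGit()
        actual := actualFromCluster()
        for name, want := range desired { // diff
            if actual[name] != want {
                // act: apply the change via the Kubernetes API
                log.Printf("reconcile %s: %q -> %q", name, actual[name], want)
            }
        }
        time.Sleep(time.Minute) // real controllers also react to watches
    }
}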

46 of 67


47 of 67

Flux: The GitOps Engine

The original GitOps implementation. Syncs desired state from Git, Helm charts or an S3 bucket to a Kubernetes cluster.

Extensible and contains many advanced features. CNCF incubating project with a large community.

Large integration ecosystem (e.g. Cluster API, OPA)

48 of 67

kspan: Visualization of Kubernetes, GitOps

kspan listens to Kubernetes Events, and turns them into OpenTelemetry (CNCF sandbox project) spans.

This plays well with GitOps: you can watch the lifecycle of your Kubernetes state as it is synced by Flux.

Jaeger (CNCF graduated project) is a good visualization frontend for OpenTelemetry.

49 of 67

kspan: Visualization of Kubernetes, GitOps


50 of 67

Yay, now addressed problem 7: Declaratively controlling a fleet of edge clusters

51 of 67

  • Kubernetes Cluster Lifecycle
  • Automate the edge with GitOps
  • Sync the edge and the cloud

Part 2: Edge Automation


52 of 67

KubeEdge

“an open source system for extending native containerized application orchestration capabilities to hosts at Edge”

Incubating CNCF project. Allows for discovery and data ingestion of MQTT-compliant devices and local HTTP APIs.

The edge part is configured by the cloud, and syncs ingested edge data to the cloud whenever possible.

Another alternative to consider would be AKRI.
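On the device side, feeding data into KubeEdge can be as simple as publishing to the MQTT broker on the edge node. Below is a minimal sketch using eclipse/paho.mqtt.golang; the broker address and topic are assumptions, so check the KubeEdge docs for the exact topic layout:

// mqttpub: publish a sensor reading to the edge node's MQTT broker,
// where KubeEdge's eventbus can pick it up and sync it to the cloud.
package main

import (
    "log"

    mqtt "github.com/eclipse/paho.mqtt.golang"
)

func main() {
    opts := mqtt.NewClientOptions().
        AddBroker("tcp://127.0.0.1:1883"). // broker on the edge node
        SetClientID("temp-sensor-1")

    client := mqtt.NewClient(opts)
    if token := client.Connect(); token.Wait() && token.Error() != nil {
        log.Fatalf("connect: %v", token.Error())
    }

    // Hypothetical topic; QoS 1 so the reading survives flaky links.
    token := client.Publish("devices/temp-sensor-1/data", 1, false, `{"celsius": 21.5}`)
    token.Wait()
    if err := token.Error(); err != nil {
        log.Fatalf("publish: %v", err)
    }
    client.Disconnect(250) // flush outstanding work, then close
}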

53 of 67

KubeEdge

[Diagram: KubeEdge capabilities]

  • Collect MQTT data
  • Connect to HTTP apps
  • Control what is run on the edge
  • Run “normal” containers on the Edge, e.g. AI inference tasks
  • Cache data and keep it in sync when connectivity is bad

54 of 67

Impressive KubeEdge use-case


55 of 67

Yay, now addressed problem 8: Keeping ingested edge data in sync with the cloud

56 of 67

What now? Implementation left as an exercise

Yes, and no. I’m working on Racklet, libgitops + more this summer to put all of these things together.

Watch this announcement ==>

If you’re interested, join the OSFW Slack at https://slack.osfw.dev/
If interested in GitOps, email me at lucas@weave.works :)

57 of 67

Thank you!

@luxas on Github

@luxas on Kubernetes’ Slack

@kubernetesonarm on Twitter

lucas@luxaslabs.com / lucas@weave.works

58 of 67

Appendix

Slides not included in the talk, but that are still relevant context

59 of 67

Example: Raspberry Pi 4

  1. On-chip bootloader that gets the next bootloader from EEPROM
    • Cannot be modified in any way. Runs on the SoC (GPU).
  2. EEPROM -> start.elf SoC/GPU bootloaders
    • The bootloader in the EEPROM can do e.g. SD Card, TFTP, USB, SSD, and NVMe booting to get start.elf and auxiliary files => a complex custom piece of code
    • start.elf can be thought of as the BIOS of the RPi; it’s configured through config.txt
    • Both are proprietary firmware files for the SoC/GPU. They initialize the RAM.
  3. Raw ARM64 binary to load into RAM

60 of 67

Problems with Raspberry Pi boot

  1. Proprietary EEPROM and start.elf bootloaders
    • Cannot know if the content is legitimate, as it is not OSS
  2. GPU has full access to RAM => can bypass CPU
    • Isolation features, exception levels, etc. sadly have no effect
  3. No support for TPMs
    • Cannot do a “trusted boot chain” where the next step is measured before execution
  4. EEPROM cannot easily be write-protected
    • A malicious user can gain persistence in the EEPROM

61 of 67

ARM UEFI compliance levels

The ARM bootloader ecosystem is fragmented => a push to standardize.

EBBR: For embedded devices; a lighter variant of SBBR. More or less what u-boot accomplishes.

SBBR: For an “out-of-the-box” experience when booting various OSes. Requires UEFI+ACPI. Work is in progress to provide an SBBR-compliant RPi 4 UEFI.

62 of 67

TF-A architecture


63 of 67

Kubernetes’ high-level component architecture

[Diagram: Kubernetes’ high-level component architecture]

Control Plane: API Server (REST API), Controller Manager (controller loops), Scheduler (binds Pods to Nodes), and etcd (key-value DB, the single source of truth). The User talks to the API Server.

Nodes: Node 1..3, each running an OS, a Container Runtime, the Kubelet, and Networking.

64 of 67

Cluster API

The next step after kubeadm

“How do I manage other lifecycle events across that infrastructure (upgrades, deletions, etc.)?”

“How can we control all of this via a consistent API across providers?”


65 of 67


66 of 67

libgitops: Read/write objects in files in Git easily

“An ORM, a library written in Go, for Kubernetes-style API objects, stored in pluggable backends, most famously Git”

Flux: “compiles” the desired declarative spec (DDS) into compiled declarative state (CDS), e.g. through the API server into etcd.

Flagger/ctrl-runtime: Acts on and reconciles actual state (CDS).

libgitops controller: Reconciles actual state (CDS) into a new desired spec (DDS).

67 of 67

libgitops enabling the “Future of GitOps” Vision

  1. Interface-driven encoders, decoders, versioners and recognizers in the system
  2. Abstract Storage mechanism; the target can be anywhere
  3. Any Object can be managed by the system due to 1)
  4. Generic Transaction model and engine built-in
  5. Lets users build e.g. GUIs