1 of 52

Software Engineering

for Machine Learning Systems

Week four: Deployment

Imperial DoC, Spring 2024

Andrew Eland

a.eland@imperial.ac.uk

CC BY-SA 4.0 (photos covered separately)

2 of 52

Deployment

3 of 52

4 of 52

5 of 52

6 of 52

Modern data centres

Historically, production data processing and serving systems ran on expensive, reliable, shared memory computers using proprietary operating systems, and a wide range of CPU architectures.

The increasing consumer market for commodity PCs caused their price to fall to a point at which they were dramatically more economical than traditional high end computers.

Commodity ethernet could be used to build clusters of these PCs. The inherent unreliability of clusters could be mitigated in software. This mitigation made clusters more reliable than high end systems, as the world around a computer is often unreliable even if the high end computer itself is not.

andreweland.org/swemls a.eland@imperial.ac.uk

7 of 52

8 of 52

Managing clusters

The move to managing clusters of thousands of machines necessitated the development of better tools to manage those machines, and the processes running on them. Maintenance tasks that were previously manual now needed to be automated.

Clusters were originally designed for specific jobs, for example, indexing web pages (high throughput) or serving websites (low latency). As engineering teams grew, clusters needed to run an increasingly large mix of jobs.

Mixing jobs on the same machines required better isolation of the jobs from each other.

9 of 52

Containerisation

UNIX has long isolated the memory of one process from another. A process attempting to read or write memory outside its own address space, such as RAM used by another process, receives SIGSEGV, which kills it by default.
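A minimal Python sketch of that memory isolation, assuming a UNIX-like system where os.fork() is available. It does not trigger SIGSEGV; it shows the same isolation from the other side: after fork(), a write made by the child never shows up in the parent. The function name is illustrative only.

```python
import os


def parent_and_child_values() -> tuple[int, int]:
    """Fork, let the child change a variable, and report both views of it."""
    value = 1
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: this assignment only touches the child's copy of memory.
        try:
            os.close(read_fd)
            value = 2
            os.write(write_fd, str(value).encode("ascii"))
        finally:
            os._exit(0)
    # Parent: read what the child saw, then check our own copy.
    os.close(write_fd)
    child_value = int(os.read(read_fd, 16).decode("ascii"))
    os.close(read_fd)
    os.waitpid(pid, 0)
    return value, child_value
```

The parent still sees 1 while the child reported 2, because each process has its own address space.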

Historically, it hasn’t provided workable isolation for filesystem access, networking, or dependency versioning. There is only one /usr/bin/python binary. You can only have one process listening on port 8000.
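The port limitation can be seen with two plain sockets. This is a sketch under assumptions: it asks the operating system for any free port rather than using 8000, and the function name is hypothetical.

```python
import errno
import socket


def second_listener_is_rejected() -> bool:
    """Return True if a second socket cannot bind a port already in use."""
    first = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    second = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    try:
        first.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        first.listen()
        port = first.getsockname()[1]
        try:
            second.bind(("127.0.0.1", port))
            second.listen()
        except OSError as error:
            # "Address already in use": only one listener per address and port.
            return error.errno == errno.EADDRINUSE
        return False
    finally:
        second.close()
        first.close()
```

Without isolation, two services that both expect the same port cannot share a machine unless one of them is reconfigured.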

“It worked on my machine” becomes untenable with clusters of thousands of machines.

10 of 52

Containerisation

To improve this, Linux introduced the clone() syscall. Like the traditional fork() syscall, it creates a new process, but gives fine-grained control of resource sharing between the parent and child processes, including the use of namespaces to isolate networks, filesystems and more.

The Linux kernel has no concept of containers. High level, user space, applications use the clone() syscall to start processes that are isolated from each other. Docker is one such application, but there are others.
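Creating namespaces normally needs elevated privileges, so this sketch only inspects them. It assumes a Linux system with /proc mounted, where each entry in /proc/<pid>/ns is a symbolic link naming the namespace a process belongs to. The function name is hypothetical, and it returns an empty result on systems without /proc.

```python
import os


def namespaces_of(pid: str = "self") -> dict[str, str]:
    """Map namespace type (net, mnt, pid, ...) to its identifier for a process."""
    ns_dir = f"/proc/{pid}/ns"
    result: dict[str, str] = {}
    if not os.path.isdir(ns_dir):
        return result  # not Linux, or /proc is unavailable
    for name in sorted(os.listdir(ns_dir)):
        try:
            # The link target looks like "net:[4026531992]".
            result[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            continue  # entry vanished or is unreadable; skip it
    return result
```

Two processes that print the same identifier for a namespace type share that namespace; a process started in a container would be expected to show different identifiers from a process on the host.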

11 of 52

Docker

Docker is a user space application that provides a GUI and command line tools to start containers via clone().

It also provides tools to build a representation of the isolated filesystem to be used by a process, called an image, and copy these images to and from a remote server, called a registry. Registries allow distribution of images to other machines.

Docker (the company) operate a default registry, which helped kickstart an ecosystem of people sharing images for running well known applications. Public cloud providers offer alternatives. You can also run your own (they’re basically fancy web servers).

12 of 52

Docker

Docker also provides a lot of magic. If you start a container using an image containing Linux binaries on OSX or Windows, it will use a hypervisor/virtual machine to (kind of) run the Linux kernel. It can be encouraged to emulate different CPUs too.

This complexity comes at a cost: it makes debugging difficult.

13 of 52

Building Docker images

FROM ubuntu:jammy

RUN apt-get update && apt-get -yq install python3

COPY simulator.py /simulator/

COPY simulator_test.py /simulator/

WORKDIR /simulator

RUN ./simulator_test.py

COPY messages.mllp /data/

EXPOSE 8440

EXPOSE 8441

CMD /simulator/simulator.py --messages=/data/messages.mllp

A Dockerfile is a script that tells Docker how to build an image.

14 of 52

Building Docker images

FROM ubuntu:jammy

Create a new image, copying the initial contents of the image from an image called ubuntu with tag (basically a named version) jammy.

If the machine building the new image doesn’t have an image called ubuntu, it checks whether its default registry, Docker Hub, has one.

In this case, the end result is that our image contains a fresh install of the Ubuntu Linux distribution, without any extra packages installed.

15 of 52

Building Docker images

FROM ubuntu:jammy

RUN apt-get update && apt-get -yq install python3

Start a new container, using the image as it currently stands for its filesystem, and run the given command within it. Any modifications made to that filesystem become part of the image for future commands.

In this case, the end result is that we install Python into our image.

16 of 52

Building Docker images

FROM ubuntu:jammy

RUN apt-get update && apt-get -yq install python3

COPY simulator.py /simulator/

COPY simulator_test.py /simulator/

Copy simulator.py and simulator_test.py from the host filesystem, specifically the context directory, into the image.

17 of 52

Building Docker images

FROM ubuntu:jammy

RUN apt-get update && apt-get -yq install python3

COPY simulator.py /simulator/

COPY simulator_test.py /simulator/

WORKDIR /simulator

RUN ./simulator_test.py

COPY messages.mllp /data/

EXPOSE 8440

EXPOSE 8441

Confusingly, EXPOSE no longer does anything by itself, but it’s generally used as documentation that the container will use ports 8440 and 8441.

18 of 52

Building Docker images

FROM ubuntu:jammy

RUN apt-get update && apt-get -yq install python3

COPY simulator.py /simulator/

COPY simulator_test.py /simulator/

WORKDIR /simulator

RUN ./simulator_test.py

COPY messages.mllp /data/

EXPOSE 8440

EXPOSE 8441

CMD /simulator/simulator.py --messages=/data/messages.mllp

Specifies the command to run by default when a container is created using this image. Stored as metadata within the image.

19 of 52

Building Docker images

% docker build -t simulator .

Build a Docker image using the Dockerfile in the current directory, with the current directory as the context (that’s the trailing dot). The -t flag names the resulting image simulator.

20 of 52

Running Docker images

% docker run simulator

Create a new container, using the image simulator as the filesystem, and run the command specified in the image metadata. Allocates an ID automatically.

% docker ps

CONTAINER ID IMAGE COMMAND

28595bf24226 simulator "/bin/sh -c '/simula…"

Shows the currently running containers

% docker stop 28595bf24226

Stops a running container

21 of 52

Communicating with containers

The whole point of containers is to provide isolation, but if they’re entirely isolated, they can’t do any useful work.

There are many ways of communicating with containers. We’ll look at three basic mechanisms: signals, shared filesystems and the network.

22 of 52

Signals

The most basic form of UNIX communication. Interrupts the control flow of a running process with a numbered signal. Used by the kill command to stop a process, but can be used for other purposes, for example, triggering the reload of configuration files.

% kill -l

HUP INT QUIT ILL TRAP ABRT EMT FPE KILL BUS SEGV SYS PIPE ALRM TERM URG STOP TSTP CONT CHLD TTIN TTOU IO XCPU XFSZ VTALRM PROF WINCH INFO USR1 USR2
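As a sketch of the configuration reload case, a Python process can register a handler for HUP with the standard signal module. The config dictionary and function names are hypothetical; a real service would re-read its configuration file in the handler.

```python
import signal

# Hypothetical stand-in for real configuration state.
config = {"reloads": 0}


def reload_config(signum, frame):
    # Called by Python when the process receives the signal.
    config["reloads"] += 1


def install_reload_handler():
    # Must be called from the main thread.
    signal.signal(signal.SIGHUP, reload_config)
```

After install_reload_handler(), running kill -HUP against the process ID would call reload_config. Calling signal.raise_signal(signal.SIGHUP) sends the same signal from within the process.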

23 of 52

Signals

% docker stop 28595bf24226

Sends TERM, waits for a grace period, then sends KILL.
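The same TERM, wait, KILL pattern can be sketched for an ordinary child process using Python's subprocess module. This is an illustration of the pattern, not Docker's implementation; the function names and the sleeping child are assumptions made for the example.

```python
import subprocess
import sys


def stop_with_grace(process: subprocess.Popen, grace_seconds: float = 10.0) -> int:
    """Ask a child to stop, then force it if it has not exited in time."""
    process.terminate()  # sends SIGTERM on POSIX systems
    try:
        return process.wait(timeout=grace_seconds)
    except subprocess.TimeoutExpired:
        process.kill()  # sends SIGKILL on POSIX systems; cannot be handled
        return process.wait()


def demo() -> int:
    # A child that would otherwise sleep for a minute.
    child = subprocess.Popen([sys.executable, "-c", "import time; time.sleep(60)"])
    return stop_with_grace(child, grace_seconds=5.0)
```

On POSIX systems a negative return code means the child was ended by that signal number, so a child that does not handle TERM reports -15. A process that handles TERM gets the grace period to finish its work before KILL arrives.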

24 of 52

Filesystems

Docker can bind a directory from the host filesystem into the isolated filesystem used by the container, allowing the host and the container to share files. Changes made by either the host or the container are reflected in the other.

% docker run -v ~/coursework3:/data simulator

Create a new container, using the image simulator as the filesystem, and run the command specified in the image metadata. The host directory ~/coursework3 is bound into the container as /data.

25 of 52

Networking

By default, the container is given a new IP address. Docker can route traffic sent to ports on the host machine to ports on the container’s new IP address.

% docker run -p 9440:8440 -p 9441:8441 simulator

Connections to port 9440 on the host machine will be routed to port 8440 within the container. For example, this would enable:

% ./coursework3.py --mllp=9440 --pager=9441

26 of 52

Networking

To communicate in the other direction, connections from within the container to host.docker.internal are routed to the host machine. For example, with a dockerised solution for coursework 3, this would enable:

% docker run \

--env MLLP_ADDRESS=host.docker.internal:8440 \

--env PAGER_ADDRESS=host.docker.internal:8441 \

coursework3
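Inside the container, the application has to read those environment variables. A minimal sketch, assuming the values have the form host:port as in the command above; the function name, the default value and the environ parameter are hypothetical.

```python
import os


def address_from_env(name: str, default: str, environ=os.environ) -> tuple[str, int]:
    """Read a host:port pair from an environment variable."""
    value = environ.get(name, default)
    # Split on the last colon so the port is always the final component.
    host, _, port = value.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"{name} should look like host:port, got {value!r}")
    return host, int(port)
```

For example, address_from_env("MLLP_ADDRESS", "localhost:8440") gives a pair that can be passed to socket.create_connection. Keeping addresses in the environment means the same image runs unchanged on a laptop and on a cluster.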

27 of 52

Image registries

Image registries permit the distribution of built images. They’re remote servers that maintain a set of images. Having built an image, you can push it to a registry. You can then pull the image on another machine.

By default, Docker pushes images to a registry run by Docker (the company). You can push an image to a different registry by prefixing the image’s name with a hostname followed by a slash.

% docker tag simulator swemls.azurecr.io/simulator

% docker push swemls.azurecr.io/simulator
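The naming rule can be sketched as a small function. This is a simplified approximation rather than Docker's own parser: it assumes the first path component is treated as a registry hostname when it contains a dot or a colon, or is localhost, and it assumes docker.io as the name of the default registry.

```python
def split_registry(image: str, default_registry: str = "docker.io") -> tuple[str, str]:
    """Split an image name into (registry, remainder)."""
    first, slash, rest = image.partition("/")
    looks_like_host = "." in first or ":" in first or first == "localhost"
    if slash and looks_like_host:
        return first, rest
    # No hostname prefix: the image belongs to the default registry.
    return default_registry, image
```

With this rule, swemls.azurecr.io/simulator is pushed to swemls.azurecr.io, while plain simulator or ubuntu:jammy refer to the default registry.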

28 of 52

29 of 52

Kubernetes

A system for managing the deployment and scaling of containers across clusters of machines. Initially developed by Google, based on its experience of building and running Borg, its internal cluster management system.

30 of 52

Pods and Nodes

[Diagram: three Nodes. Node 1 runs two Pods (containerised application 1 and containerised application 2); Node 2 and Node 3 each run one Pod (containerised application 3). Each Pod contains a containerised application and its image resources.]

31 of 52

Pods and Nodes

A Pod is the smallest unit of compute that Kubernetes handles. It’s a collection of one or more containers sharing storage and networking.

Kubernetes runs pods by placing them on a machine, which it calls a Node. Clusters normally have several nodes, and possibly thousands.

Pods are disposable. A pod exists until the containers within it finish execution or crash. Kubernetes can also evict a pod if it needs resources, or if the node on which it’s running fails.

Often pods aren’t created explicitly. They’re managed by higher level Kubernetes resources that, for example, replicate a container to handle increased load.

32 of 52

Deployments

[Diagram: Deployment 1, replicas: 2, managing two Pods, each containing containerised application 3 and its image resources.]

33 of 52

Deployments

A deployment describes the desired state of a set of pods. You specify a template on which to base the configuration of individual pods, together with how many instantiations of that template you’d like (called replicas). The Kubernetes controller will attempt to meet the desired state by starting (or stopping) pods.
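The idea of converging on a desired state can be sketched with a toy model. This is not Kubernetes code: ToyCluster, start_pod, stop_pod and reconcile are hypothetical names, and a real controller reacts to cluster events rather than being called by hand.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class ToyCluster:
    """A pretend cluster that only tracks the names of running pods."""
    pods: list[str] = field(default_factory=list)
    _ids: itertools.count = field(default_factory=itertools.count, repr=False)

    def start_pod(self, template: str) -> None:
        self.pods.append(f"{template}-{next(self._ids)}")

    def stop_pod(self) -> None:
        self.pods.pop()


def reconcile(cluster: ToyCluster, template: str, replicas: int) -> list[str]:
    """Start or stop pods until the running count matches the desired count."""
    while len(cluster.pods) < replicas:
        cluster.start_pod(template)
    while len(cluster.pods) > replicas:
        cluster.stop_pod()
    return list(cluster.pods)
```

If a pod disappears, the next call to reconcile starts a replacement with a new name, which is why individual pods should be treated as disposable.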

34 of 52

Services

[Diagram: three Nodes. Node 1 runs two Pods (containerised application 1 and containerised application 2); Node 2 and Node 3 each run one Pod (containerised application 3). Service 1 maps its port 8000 to port 8000 of containerised application 1.]

35 of 52

Services

As pods are disposable, the IP address for a given workload will change over time, as the pods implementing it are created and destroyed. This makes it difficult for one pod to find and communicate with another.

Services provide a stable IP address and port number, which Kubernetes routes to a pod providing the given service.

36 of 52

Namespaces

[Diagram: Namespace 1, drawn across Node 1, Node 2 and Node 3, grouping Service 1 (port 8000, routed to containerised application 1 port 8000) and the Pod containing containerised application 2 and its image resources.]

37 of 52

Namespaces

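Pods, Nodes, Deployments and Services are Kubernetes Resources. Most resources, including Pods, Deployments and Services, belong to a Namespace, which prevents the identifiers used from conflicting with those defined in other Namespaces. Nodes are an exception: they belong to the cluster as a whole.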

Namespaces are also the level at which access control is configured.

38 of 52

Configuration

Pods, Nodes, Deployments, Services and Namespaces are all Kubernetes resources. Each resource is (typically) defined using a YAML file, applied to the cluster with the kubectl command.

Configuration is eventually consistent. Unless the definition itself contains errors, kubectl will immediately return successfully, and the cluster scheduler will attempt to make the state of the cluster match your desired state. It may never be possible to reach that state.

39 of 52

Configuration

spec:
  containers:
  - name: simulator
    image: imperialswemlsspring2024.azurecr.io/simulator
    command: ["/simulator/simulator.py"]
    args: ["--messages=/data/messages.mllp"]
    ports:
    - name: mllp
      containerPort: 8440
    - name: pager
      containerPort: 8441

40 of 52

Configuration

apiVersion: apps/v1
kind: Deployment
metadata:
  name: simulator
spec:
  replicas: 1
  selector:
    matchLabels:
      app: simulator
  template:
    metadata:
      labels:
        app: simulator
    spec: see previous slide

41 of 52

Configuration

% kubectl apply -f simulator.yaml

deployment.apps/simulator created

% kubectl get pods

NAME READY STATUS

simulator-6d98d6b774-r9km2 0/1 ContainerCreating

% kubectl get pods

NAME READY STATUS

simulator-6d98d6b774-r9km2 1/1 Running

% kubectl get deployments

NAME READY UP-TO-DATE AVAILABLE AGE

simulator 1/1 1 1 3s

42 of 52

43 of 52

44 of 52

Coursework four:

Running inference on Kubernetes

45 of 52

Coursework overview

Week 1: Training a model

Week 2: Inference design document

Week 3: Building inference

Week 4: Running inference on Kubernetes

Week 5: Adding monitoring

Weeks 6 & 7: Keeping the service alive

46 of 52

Running inference on Kubernetes

We’ll deploy the application you’ve written for coursework 3 on a Kubernetes cluster running on Azure.

Best case: you write a single configuration file, and apply it.

Worst case: you have to redesign your system to handle failure.

47 of 52

Guaranteed failures

The simulator will unexpectedly close its MLLP connection with you.

You won’t be able to resolve the DNS name when trying to connect to the simulator.

Kubernetes will unexpectedly shut down the pod running your solution by sending it SIGTERM, and restart it elsewhere.
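One way to tolerate the first two failures is to retry the connection with a growing delay. This is a sketch under assumptions, not a required design: the function name, the retry count and the delays are hypothetical, and the connect and sleep parameters exist only so the behaviour can be exercised without a network.

```python
import socket
import time


def connect_with_retry(host, port, attempts=5, delay=0.5,
                       connect=socket.create_connection, sleep=time.sleep):
    """Try to connect, backing off after DNS or connection errors."""
    last_error = None
    for attempt in range(attempts):
        try:
            return connect((host, port))
        except OSError as error:
            # socket.gaierror (failed DNS lookup) and refused or reset
            # connections are all subclasses of OSError.
            last_error = error
            if attempt < attempts - 1:
                sleep(delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
    raise ConnectionError(f"could not connect to {host}:{port}") from last_error
```

The same function can be called again whenever the simulator closes an established connection, so that reconnection and first connection share one code path.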

48 of 52

Unlikely failures

One of the machines acting as a Kubernetes node will fail, and your solution will disappear without warning.

49 of 52

Graceful shutdown

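while i < len(messages) and not shutdown_mllp.is_set():
    try:
        # Frame the message: start of block, payload, end of block, carriage return.
        mllp = bytes(chr(MLLP_START_OF_BLOCK), "ascii")
        mllp += messages[i]
        mllp += bytes(chr(MLLP_END_OF_BLOCK), "ascii")
        mllp += bytes(chr(MLLP_CARRIAGE_RETURN), "ascii")
        client.sendall(mllp)
    # (the rest of the loop, including the except clause, is not shown on the slide)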

50 of 52

Graceful shutdown

import signal

def main():
    # actual application code

    def shutdown():
        shutdown_event.set()
        print("pager: graceful shutdown")
        pager.shutdown()

    # Run shutdown() when the process receives SIGTERM.
    signal.signal(signal.SIGTERM, lambda *args: shutdown())
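Put together, the pattern on these two slides can be sketched as a small self-contained program. The names install_handler and process_messages are hypothetical, and the message handling is a placeholder for real application work.

```python
import signal
import threading

shutdown_event = threading.Event()


def install_handler():
    # Must be called from the main thread. The handler only sets a flag;
    # the main loop decides when it is safe to stop.
    signal.signal(signal.SIGTERM, lambda *args: shutdown_event.set())


def process_messages(messages, handle):
    """Handle messages one at a time, stopping cleanly once shutdown is requested."""
    processed = 0
    for message in messages:
        if shutdown_event.is_set():
            break
        handle(message)  # finish the current message before checking again
        processed += 1
    return processed
```

Because the flag is only checked between messages, a message that is already being handled completes before the process exits, which avoids leaving work half done when the pod is moved.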

51 of 52

Marking

Consume some simulator messages on Kubernetes: 100%

52 of 52

Good luck. See you Friday.

a.eland@imperial.ac.uk

andreweland.org/swemls